Camera Calibration
Wanho Choi
(wanochoi.com)
• How to solve Ax=b: http://wanochoi.com/lecture/Ax=b.pdf
• Least Squares: http://wanochoi.com/lecture/Least_Squares.pdf
• Transformation Matrix: http://wanochoi.com/lecture/TransformationMatrix.pdf
• Rotation: http://wanochoi.com/lecture/Rotation.pdf
Preliminaries
How to capture? [Idea #1]
• Place the film (photon sensor) directly facing the object, which is lit by the sun.
• Every point on the film receives light from many points on the object, so the captured image is fully blurred: no image forms.
How to capture? [Idea #2]
• Place a barrier with a small hole (known as an aperture) between the sunlit object and the film (photon sensor).
• The captured image is upside-down and sharp.
• However, insufficient light reaches the film, so a long exposure time is required.
How to capture? [Idea #3]
• Replace the barrier with a lens in front of the film (photon sensor).
• The captured image is upside-down, sharp, and receives sufficient light from the object and the light source (the sun).
Convex Lens Formula
With object height $H$ at distance $a$ from the optical center, image height $h$ at distance $b$ behind the lens, and focal length $f$ (the focal point lies on the optical axis; the upright object maps to an upside-down image):
$$\tan\alpha = \frac{H}{a} = \frac{h}{b}, \qquad \tan\beta = \frac{H}{f} = \frac{h}{b - f}$$
$$\frac{h}{H} = \frac{b}{a}, \qquad \frac{h}{H} = \frac{b - f}{f} = \frac{b}{f} - 1$$
$$\therefore\ \frac{b}{a} = \frac{b}{f} - 1 \;\Rightarrow\; \frac{b}{f} = \frac{b}{a} + 1 = \frac{a + b}{a} \;\Rightarrow\; \therefore\ \frac{1}{f} = \frac{a + b}{ab} = \frac{1}{a} + \frac{1}{b}$$
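As a quick numeric check of the formula: with $f = 50\,\mathrm{mm}$ and an object at $a = 200\,\mathrm{mm}$, $\frac{1}{b} = \frac{1}{50} - \frac{1}{200} = \frac{3}{200}$, so the image forms at $b \approx 66.7\,\mathrm{mm}$ behind the lens.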
Aperture
• The aperture controls the amount of light entering the lens.
• It also affects the depth of field: the range of distances over which objects appear acceptably sharp (in focus).
• The smaller the opening, the closer the camera gets to a pinhole camera.
Ideal Thin Lens vs Real Lens
• Real lenses suffer from spherical aberration and chromatic aberration (unlike the ideal thin lens).
• https://expertphotography.com/chromatic-aberration-photography/
• https://en.wikipedia.org/wiki/Spherical_aberration
• http://www.drewgrayphoto.com/learn/distortion101
Pinhole Camera
• Pinhole as a point
• Pencil of rays: every ray passes through a single point (the focal point)
• One ray per scene point
• (figure: image plane, pinhole, virtual image)

Camera Obscura
• Literally "dark room"
• An optical device built as a drawing aid; the predecessor of photography
Homogeneous Coordinate System
• The coordinate system used in Euclidean geometry: the Cartesian coordinate system
• The coordinate system used in projective geometry: the homogeneous coordinate system
$$(x, y, z) \;\rightarrow\; (x, y, z, 1) \;\xrightarrow{\ 4\times4 \text{ or } 3\times4 \text{ matrix computation}\ }\; (x, y, z, w) \;\rightarrow\; (x/w,\ y/w,\ z/w)$$

Homogeneous Coordinate
• The 2D point (x, y) is represented by the homogeneous coordinate (x, y,1).
• In general, the homogeneous coordinate (x, y, w) represents the 2D point (x/w, y/w).
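For example, the homogeneous coordinates $(4, 6, 2)$ and $(2, 3, 1)$ represent the same 2D point $(2, 3)$.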
Pinhole Camera Model
world space (3D) → camera space (3D):
$$\begin{bmatrix} X_C \\ Y_C \\ Z_C \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

camera space (3D) → image plane space (2D):
$$\begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_C \\ Y_C \\ Z_C \\ 1 \end{bmatrix}, \qquad (x, y) = (\tilde{x}/\tilde{z},\ \tilde{y}/\tilde{z})$$

image plane space (2D) → pixel (sensor) space (2D):
$$\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix} = \begin{bmatrix} s_x & s_\theta & u_c \\ 0 & s_y & v_c \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{bmatrix}, \qquad (u, v) = (\tilde{u}/\tilde{w},\ \tilde{v}/\tilde{w})$$
where $s_x, s_y$ are the pixel scales, $(u_c, v_c)$ is the image center, and $s_\theta$ is the skewness (usually negligible or zero).

Projection onto the image plane (with $O = C_0$ the optical center, $Z_C$ along the optical axis, and $f$ the focal length): a 3D point $P(X_C, Y_C, Z_C)$ maps to
$$p = \begin{bmatrix} x \\ y \end{bmatrix} \;\equiv\; \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{z} \end{bmatrix} = \begin{bmatrix} f X_C / Z_C \\ f Y_C / Z_C \\ 1 \end{bmatrix} \;\equiv\; \begin{bmatrix} f X_C \\ f Y_C \\ Z_C \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_C \\ Y_C \\ Z_C \\ 1 \end{bmatrix}$$

Rotation and translation are handled by the extrinsic matrix; the 3D-to-2D projection factors into a zooming part and a canonical part:
$$\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$
where the right factor is the standard (or canonical) projection matrix.
Pinhole Camera Model
$$\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix} =
\begin{bmatrix} s_x & s_\theta & u_c \\ 0 & s_y & v_c \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
• K: intrinsic parameters (the first two factors)
• W: extrinsic parameters (the 4×4 rigid-transform factor)
• C = KW: camera matrix
Explicitly, the intrinsic matrix is
$$K = \begin{bmatrix} \alpha & \gamma & u_c \\ 0 & \beta & v_c \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} s_x & s_\theta & u_c \\ 0 & s_y & v_c \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad (\alpha = f s_x,\ \beta = f s_y,\ \gamma = f s_\theta)$$
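To make the matrix chain concrete, here is a minimal NumPy sketch that pushes one world point through $K[R\,|\,t]$; every numeric value (focal length, image center, pose) is invented for illustration and is not from the lecture.

```python
import numpy as np

# Made-up intrinsics: alpha = beta = 800, gamma = 0, image center (320, 240).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                        # camera aligned with the world axes
t = np.array([[0.0], [0.0], [5.0]])  # world origin 5 units in front of the camera

P = np.array([[0.1], [0.2], [0.0], [1.0]])  # homogeneous world point (X, Y, Z, 1)
p_tilde = K @ np.hstack([R, t]) @ P         # (u~, v~, w~)
u, v = p_tilde[:2, 0] / p_tilde[2, 0]       # perspective division
print(u, v)                                 # pixel coordinates
```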
Homography
• Transformation between two different planes
• Homography matrix
‣ 3x3 square matrix
‣ But, 8 DoF as it is estimated up to a scale
‣ It is generally normalized with $h_{33} = 1$ (a small application sketch follows below).
$$s \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
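As a small illustration of the mapping above, this NumPy sketch applies a homography to one point; the matrix H is an arbitrary example, not one estimated from data.

```python
import numpy as np

# Arbitrary example homography (h33 = 1).
H = np.array([[1.2,  0.1, 30.0],
              [0.0,  1.1, 10.0],
              [1e-4, 2e-4, 1.0]])

x = np.array([100.0, 50.0, 1.0])   # homogeneous source point (x, y, 1)
xp = H @ x                          # s * (x', y', 1)
x_prime, y_prime = xp[:2] / xp[2]   # divide out the scale s
print(x_prime, y_prime)
```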
Examples of Homography
DLT (Direct Linear Transformation)
$$\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix} =
\begin{bmatrix} \alpha & \gamma & u_c \\ 0 & \beta & v_c \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
• (u, v): observed image point (measured); (X, Y, Z): known control point (given)
• 11 unknowns (11 D.O.F.): 5 intrinsic unknowns $\alpha, \beta, \gamma, u_c, v_c$ and 6 extrinsic unknowns $r_x, r_y, r_z, t_x, t_y, t_z$

DLT (Direct Linear Transformation)
Writing the product compactly as the camera matrix C:
$$\underset{3 \times 1}{p} = \underset{3 \times 4}{C}\ \underset{4 \times 1}{P}$$

DLT (Direct Linear Transformation)
$$p = CP \quad\Longleftrightarrow\quad \begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ C_{21} & C_{22} & C_{23} & C_{24} \\ C_{31} & C_{32} & C_{33} & C_{34} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
$$u = \frac{\tilde{u}}{\tilde{w}} = \frac{C_{11}X + C_{12}Y + C_{13}Z + C_{14}}{C_{31}X + C_{32}Y + C_{33}Z + C_{34}}, \qquad v = \frac{\tilde{v}}{\tilde{w}} = \frac{C_{21}X + C_{22}Y + C_{23}Z + C_{24}}{C_{31}X + C_{32}Y + C_{33}Z + C_{34}}$$
C has 11 degrees of freedom (it is defined only up to scale) and each point pair gives 2 equations, so we need at least 6 point pairs.

DLT (Direct Linear Transformation)
Rearranging into a linear system in the entries of C, each point pair contributes two rows:
$$\begin{bmatrix} -X & -Y & -Z & -1 & 0 & 0 & 0 & 0 & uX & uY & uZ & u \\ 0 & 0 & 0 & 0 & -X & -Y & -Z & -1 & vX & vY & vZ & v \end{bmatrix}
\begin{bmatrix} C_{11} \\ C_{12} \\ C_{13} \\ C_{14} \\ C_{21} \\ C_{22} \\ C_{23} \\ C_{24} \\ C_{31} \\ C_{32} \\ C_{33} \\ C_{34} \end{bmatrix} = 0$$

DLT (Direct Linear Transformation)
• For N point pairs:
$$\begin{bmatrix}
-X_1 & -Y_1 & -Z_1 & -1 & 0 & 0 & 0 & 0 & u_1X_1 & u_1Y_1 & u_1Z_1 & u_1 \\
0 & 0 & 0 & 0 & -X_1 & -Y_1 & -Z_1 & -1 & v_1X_1 & v_1Y_1 & v_1Z_1 & v_1 \\
-X_2 & -Y_2 & -Z_2 & -1 & 0 & 0 & 0 & 0 & u_2X_2 & u_2Y_2 & u_2Z_2 & u_2 \\
0 & 0 & 0 & 0 & -X_2 & -Y_2 & -Z_2 & -1 & v_2X_2 & v_2Y_2 & v_2Z_2 & v_2 \\
\vdots & & & & & & & & & & & \vdots \\
-X_N & -Y_N & -Z_N & -1 & 0 & 0 & 0 & 0 & u_NX_N & u_NY_N & u_NZ_N & u_N \\
0 & 0 & 0 & 0 & -X_N & -Y_N & -Z_N & -1 & v_NX_N & v_NY_N & v_NZ_N & v_N
\end{bmatrix}
\begin{bmatrix} C_{11} \\ C_{12} \\ C_{13} \\ C_{14} \\ C_{21} \\ C_{22} \\ C_{23} \\ C_{24} \\ C_{31} \\ C_{32} \\ C_{33} \\ C_{34} \end{bmatrix} = 0$$
$$\underset{2N \times 12}{M}\ \underset{12 \times 1}{c} = 0$$
DLT (Direct Linear Transformation)
$$Mc = 0 \quad\rightarrow\quad \text{in practice, } Mc = w \neq 0$$
$$\hat{c} = \underset{c}{\operatorname{argmin}}\ (w^T w) \quad \text{subject to } \|c\| = 1$$
Using the SVD (Singular Value Decomposition) $M = USV^T$:
$$w^T w = (Mc)^T(Mc) = c^T M^T M c = c^T (USV^T)^T (USV^T)\, c = c^T V S^T U^T U S V^T c = c^T V (S^T S) V^T c$$
$$= c^T \left( \sum_{i=1}^{12} s_i^2\, v_i v_i^T \right) c = c^T \left( s_1^2 v_1 v_1^T + s_2^2 v_2 v_2^T + \cdots + s_{12}^2 v_{12} v_{12}^T \right) c$$
Since the $v_i$ are orthonormal ($v_i^T v_j = 0$ for $i \neq j$), the minimum is attained at the 12th column of V, i.e. the right singular vector with the smallest singular value:
$$\therefore\ \hat{c} = v_{12}$$
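The whole DLT estimate can be sketched in a few lines of NumPy, assuming N ≥ 6 noisy world-image correspondences; `dlt_camera_matrix` is just an illustrative name.

```python
import numpy as np

def dlt_camera_matrix(world_pts, image_pts):
    """world_pts: (N, 3) control points; image_pts: (N, 2) observed pixels, N >= 6."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        rows.append([-X, -Y, -Z, -1,  0,  0,  0,  0, u*X, u*Y, u*Z, u])
        rows.append([ 0,  0,  0,  0, -X, -Y, -Z, -1, v*X, v*Y, v*Z, v])
    M = np.asarray(rows)                 # shape (2N, 12)
    _, _, Vt = np.linalg.svd(M)
    c = Vt[-1]                           # right singular vector of the smallest singular value
    return c.reshape(3, 4)               # camera matrix C, defined up to scale
```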
If individual parameters are needed
$$C = [\,H \mid h\,] = KR\,[\,I \mid -C_0\,] = H\,[\,I \mid -C_0\,], \qquad H := KR \text{ is the known left } 3 \times 3 \text{ block of } C$$
• Factor $H = KR$ with $K$ upper triangular and $R$ a rotation (an RQ decomposition of $H$), then normalize $K \leftarrow K / K_{33}$ (homogeneity normalization).
• From $h = -H C_0$: $\ \therefore\ C_0 = -H^{-1} h$.
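Assuming SciPy is available, the decomposition can be sketched as follows; `decompose_camera_matrix` is a hypothetical helper, and the sign fix-up reflects the usual ambiguity of the RQ factorization.

```python
import numpy as np
from scipy.linalg import rq

def decompose_camera_matrix(C):
    """Recover K, R, C0 from a DLT-estimated C = [H | h] (defined up to scale)."""
    H, h = C[:, :3], C[:, 3]
    K, R = rq(H)                       # H = K R, K upper triangular, R orthonormal
    S = np.diag(np.sign(np.diag(K)))   # make the diagonal of K positive
    K, R = K @ S, S @ R                # (in practice also check det(R) = +1)
    K = K / K[2, 2]                    # homogeneity normalization (K33 = 1)
    C0 = -np.linalg.solve(H, h)        # camera center from h = -H C0
    return K, R, C0
```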
Camera Calibration using 2D Pattern
• Checkerboard
‣ Size & structure are known.
‣ Easy to set & get the points.
• All points are on a plane, so Z=0.
• We cannot solve the problem with the general DLT process: with $Z_i = 0$ for every point, the columns of $M$ that multiply $C_{13}$, $C_{23}$, and $C_{33}$ are all zero, so $M$ becomes rank deficient.
A Simple Trick!
Apply the pinhole model to the planar target: since every control point has $Z = 0$, the third column of the extrinsic matrix drops out.
$$\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix} =
\begin{bmatrix} \alpha & \gamma & u_c \\ 0 & \beta & v_c \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\;\Longrightarrow\;
\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{bmatrix} =
\begin{bmatrix} \alpha & \gamma & u_c \\ 0 & \beta & v_c \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & t_1 \\ r_{21} & r_{22} & t_2 \\ r_{31} & r_{32} & t_3 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$
$$H = [\,h_1, h_2, h_3\,] = K\,[\,r_1, r_2, t\,] \qquad \text{(8 unknowns, 8 D.O.F.)}$$

Homography
• Linear transformation between two different planes
$$p = HP$$
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = [\,h_1, h_2, h_3\,] \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$
• (u, v): observed image point (measured); (X, Y): known control point (given)
• 8 unknowns (with $h_{33} = 1$), and each point pair gives 2 equations (see the estimation sketch below).
So, we need at least 4 point pairs.
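Estimating H follows the same DLT recipe as the camera matrix, now with a 9-vector of unknowns; the sketch below assumes N ≥ 4 plane-image correspondences and uses an illustrative function name.

```python
import numpy as np

def estimate_homography(plane_pts, image_pts):
    """plane_pts: (N, 2) points (X, Y) on the pattern; image_pts: (N, 2) points (u, v)."""
    rows = []
    for (X, Y), (u, v) in zip(plane_pts, image_pts):
        rows.append([-X, -Y, -1,  0,  0,  0, u*X, u*Y, u])
        rows.append([ 0,  0,  0, -X, -Y, -1, v*X, v*Y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    H = Vt[-1].reshape(3, 3)            # right singular vector of the smallest singular value
    return H / H[2, 2]                  # enforce h33 = 1
```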
How to get K, R, and T from H
$$H = [\,h_1, h_2, h_3\,] = K\,[\,r_1, r_2, t\,]$$
• Because $[\,r_1, r_2, t\,]$ is not a rotation matrix, the QR-decomposition approach used before cannot be applied here.
$$r_1 = K^{-1} h_1, \qquad r_2 = K^{-1} h_2$$
$$r_1^T r_2 = 0, \qquad r_1^T r_1 = r_2^T r_2 = 1$$
From these relations we obtain the following two constraints:
$$(K^{-1}h_1)^T(K^{-1}h_2) = 0, \qquad (K^{-1}h_1)^T(K^{-1}h_1) = (K^{-1}h_2)^T(K^{-1}h_2)$$
$$\therefore\quad h_1^T K^{-T} K^{-1} h_2 = 0, \qquad h_1^T K^{-T} K^{-1} h_1 - h_2^T K^{-T} K^{-1} h_2 = 0$$
How to get K, R, and T
$$h_1^T K^{-T} K^{-1} h_2 = 0, \qquad h_1^T K^{-T} K^{-1} h_1 - h_2^T K^{-T} K^{-1} h_2 = 0$$
Define
$$B := K^{-T} K^{-1} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}$$
B is a symmetric positive definite matrix, since $B = K^{-T}K^{-1} = (K^{-T})(K^{-T})^T = AA^T$ (a Cholesky-type factorization). The constraints become
$$h_1^T B h_2 = 0, \qquad h_1^T B h_1 - h_2^T B h_2 = 0$$
With
$$b := [\,b_{11}\ \ b_{12}\ \ b_{13}\ \ b_{22}\ \ b_{23}\ \ b_{33}\,]^T, \qquad
v_{ij} := [\,h_{1i}h_{1j},\ \ h_{1i}h_{2j} + h_{2i}h_{1j},\ \ h_{3i}h_{1j} + h_{1i}h_{3j},\ \ h_{2i}h_{2j},\ \ h_{3i}h_{2j} + h_{2i}h_{3j},\ \ h_{3i}h_{3j}\,]^T$$
where $h_i = [h_{1i}, h_{2i}, h_{3i}]^T$ is the $i$-th column of H, each homography contributes two linear equations in b:
$$v_{12}^T\, b = 0, \qquad (v_{11} - v_{22})^T\, b = 0 \quad\Longleftrightarrow\quad \begin{bmatrix} v_{12}^T \\ (v_{11} - v_{22})^T \end{bmatrix} b = 0$$

How to get K, R, and T
Stacking the two equations contributed by each homography, from the 1st to the n-th view of the pattern:
$$\begin{bmatrix} v_{12}^T \\ (v_{11} - v_{22})^T \\ \vdots \\ v_{12}^T \\ (v_{11} - v_{22})^T \end{bmatrix} b = 0 \quad\Longrightarrow\quad Vb = 0$$
In practice the measurements are noisy, so $Vb = w \neq 0$ and
$$\hat{b} = \underset{b}{\operatorname{argmin}}\ (w^T w) \quad \text{subject to } \|b\| = 1,$$
which is solved exactly as before: $\hat{b}$ is the right singular vector of V with the smallest singular value.
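A sketch of building V from the per-view homographies and solving for b; columns are indexed from 0 in the code, so $v_{12}$ becomes `v_ij(H, 0, 1)`, and the function names are illustrative.

```python
import numpy as np

def v_ij(H, i, j):
    """Constraint vector for columns i, j of H; b ordering: [b11, b12, b13, b22, b23, b33]."""
    return np.array([H[0, i]*H[0, j],
                     H[0, i]*H[1, j] + H[1, i]*H[0, j],
                     H[2, i]*H[0, j] + H[0, i]*H[2, j],
                     H[1, i]*H[1, j],
                     H[2, i]*H[1, j] + H[1, i]*H[2, j],
                     H[2, i]*H[2, j]])

def solve_b(homographies):
    V = []
    for H in homographies:                        # one homography per view of the pattern
        V.append(v_ij(H, 0, 1))                   # v12^T b = 0
        V.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))   # (v11 - v22)^T b = 0
    _, _, Vt = np.linalg.svd(np.asarray(V))
    return Vt[-1]                                 # b, up to scale
```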
If individual parameters are needed
$$H = [\,h_1, h_2, h_3\,] = K\,[\,r_1, r_2, t\,]$$
$$B = K^{-T}K^{-1} = AA^T \ \text{(Cholesky decomposition)} \quad\therefore\ K = A^{-T}$$
$$r_1 = K^{-1} h_1, \qquad r_2 = K^{-1} h_2 \qquad (\text{and } r_3 = r_1 \times r_2 \text{ to complete } R)$$
$$h_3 = K\,t \ \rightarrow\ t = K^{-1} h_3$$
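Continuing the sketch, K can be recovered from b via a Cholesky factorization and the per-view extrinsics from each H; this assumes b comes from `solve_b()` above, and the scale factor `lam` (chosen so that ||r1|| = 1) is a standard detail not spelled out on the slide.

```python
import numpy as np

def intrinsics_from_b(b):
    b11, b12, b13, b22, b23, b33 = b
    B = np.array([[b11, b12, b13],
                  [b12, b22, b23],
                  [b13, b23, b33]])
    if B[0, 0] < 0:                    # b is only defined up to scale; flip the sign if needed
        B = -B
    A = np.linalg.cholesky(B)          # B = A A^T, A lower triangular
    K = np.linalg.inv(A.T)             # K = A^{-T}
    return K / K[2, 2]                 # normalize so that K33 = 1

def extrinsics_from_h(K, H):
    Kinv = np.linalg.inv(K)
    lam = 1.0 / np.linalg.norm(Kinv @ H[:, 0])   # scale so that ||r1|| = 1
    r1, r2 = lam * Kinv @ H[:, 0], lam * Kinv @ H[:, 1]
    r3 = np.cross(r1, r2)                        # complete the rotation
    t = lam * Kinv @ H[:, 2]
    return np.column_stack([r1, r2, r3]), t
```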
Dealing with Radial Distortion
• Distortion: non-linear error, especially radial distortion
• The distortion function is dominated by the radial components, and especially by the first term.
• Moreover, a more elaborate model would cause numerical instability.
• So, we only consider the first two terms of radial distortion.
$$\breve{x} = x + x\,[\,k_1(x^2 + y^2) + k_2(x^2 + y^2)^2\,]$$
$$\breve{y} = y + y\,[\,k_1(x^2 + y^2) + k_2(x^2 + y^2)^2\,]$$
where $(x, y)$ is the ideal (distortion-free) point and $(\breve{x}, \breve{y})$ is the real (distorted) point.

In pixel coordinates (assuming zero skew),
$$\begin{bmatrix} \tilde{u} \\ \tilde{v} \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha & 0 & u_c \\ 0 & \beta & v_c \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \tilde{x} \\ \tilde{y} \\ 1 \end{bmatrix}, \qquad \breve{u} = u_c + \alpha \breve{x}, \quad \breve{v} = v_c + \beta \breve{y}$$
$$\breve{u} = u + (u - u_c)\,[\,k_1(x^2 + y^2) + k_2(x^2 + y^2)^2\,]$$
$$\breve{v} = v + (v - v_c)\,[\,k_1(x^2 + y^2) + k_2(x^2 + y^2)^2\,]$$
Each observed point therefore gives two linear equations in $k_1, k_2$:
$$\begin{bmatrix} (u - u_c)(x^2 + y^2) & (u - u_c)(x^2 + y^2)^2 \\ (v - v_c)(x^2 + y^2) & (v - v_c)(x^2 + y^2)^2 \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \begin{bmatrix} \breve{u} - u \\ \breve{v} - v \end{bmatrix}$$
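Given ideal and observed points, $k_1$ and $k_2$ follow from an ordinary linear least-squares solve; the sketch below assumes the (x, y) are the ideal normalized coordinates, (u, v) the ideal pixels, and (ub, vb) the observed pixels.

```python
import numpy as np

def estimate_radial_distortion(xy, uv_ideal, uv_observed, uc, vc):
    rows, rhs = [], []
    for (x, y), (u, v), (ub, vb) in zip(xy, uv_ideal, uv_observed):
        r2 = x*x + y*y
        rows.append([(u - uc) * r2, (u - uc) * r2 * r2]); rhs.append(ub - u)
        rows.append([(v - vc) * r2, (v - vc) * r2 * r2]); rhs.append(vb - v)
    k, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(rhs), rcond=None)
    return k                                     # [k1, k2]
```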
Maximum Likelihood Estimation
• Non-linear optimization problem: refine all parameters by minimizing the total reprojection error over the N views (see the sketch below).
• Solved with the Levenberg-Marquardt algorithm.
• Initial guess from the DLT / closed-form solution above.
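A sketch of the final refinement with SciPy's Levenberg-Marquardt solver; `project` is a hypothetical helper that re-applies the full camera model (intrinsics, distortion, per-view pose) to the pattern points, and the parameter packing is left abstract.

```python
import numpy as np
from scipy.optimize import least_squares

def refine(params0, project, world_pts, image_pts_per_view):
    """Minimize the total reprojection error, starting from the DLT/closed-form guess."""
    def residuals(params):
        res = []
        for i, observed in enumerate(image_pts_per_view):
            predicted = project(params, i, world_pts)   # (M, 2) reprojected points for view i
            res.append((predicted - observed).ravel())
        return np.concatenate(res)
    return least_squares(residuals, params0, method="lm").x
```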