(1)

Least Squares

Wanho Choi

(wanochoi.com)

(2)

Linear Least Squares Problem

Ax = b,  m > n

(# of equations) > (# of unknowns) : over-determined system

A : m × n,  x : n × 1,  b : m × 1

No Exact Solution

(3)

The Nearest Solution

Ax ≈ b  ⇔  argminₓ ‖Ax − b‖²

residual : r = Ax − b

A : m × n,  x : n × 1,  b : m × 1

(4)

How to Solve

Normal equations (faster, cheaper)

QR decomposition

SVD (Singular Value Decomposition) (more accurate, more reliable)

(5)

Normal Equation

‖r‖₂² = rᵀr
= (Ax − b)ᵀ(Ax − b)
= (xᵀAᵀ − bᵀ)(Ax − b)
= xᵀAᵀAx − xᵀAᵀb − bᵀAx + bᵀb
= xᵀAᵀAx − xᵀAᵀb − xᵀAᵀb + bᵀb   (∵ bᵀAx is a scalar, so bᵀAx = (bᵀAx)ᵀ = xᵀAᵀb)
= xᵀAᵀAx − 2xᵀAᵀb + bᵀb

∂‖r‖₂²/∂x = 2AᵀAx − 2Aᵀb = 0

∴ AᵀAx = Aᵀb : normal equations
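As a minimal sketch (assuming Eigen3, the library used in the examples at the end of this deck), the normal equations can be solved with a single symmetric factorization; the 3×2 system is the same one used on the Eigen3 Example slides:

#include <iostream>
#include <Eigen/Dense>

int main()
{
    // Over-determined 3x2 system (the same example as the Eigen3 slides).
    Eigen::MatrixXd A( 3, 2 );
    A << 2, -1,
         1, -3,
         2,  1;
    Eigen::VectorXd b( 3 );
    b << 2, -3, 2;

    // Normal equations: solve (A^T A) x = A^T b.
    // LDLT applies because A^T A is symmetric positive semi-definite.
    Eigen::VectorXd x = ( A.transpose() * A ).ldlt().solve( A.transpose() * b );
    std::cout << x.transpose() << std::endl; // ~ (0.91111, 1.06667)
    return 0;
}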

(6)

Geometric Derivation

Ay = [a₁₁ … a₁ₙ; a₂₁ … a₂ₙ; … ; aₘ₁ … aₘₙ][y₁; y₂; … ; yₙ]
   = y₁[a₁₁; a₂₁; … ; aₘ₁] + y₂[a₁₂; a₂₂; … ; aₘ₂] + ⋯ + yₙ[a₁ₙ; a₂ₙ; … ; aₘₙ]

Ay : the column space of A

[Figure: the vector b, its projection p = Ax onto the column space, and the residual r = b − Ax]

For any y, Ay lies in the column space of A, and the residual is orthogonal to it:

(Ay) ⋅ (b − Ax) = (Ay)ᵀ(b − Ax) = yᵀAᵀ(b − Ax) = 0

∴ Aᵀ(b − Ax) = 0 ⇔ AᵀAx = Aᵀb

‖r‖ is minimized when y = x.

(7)

Pseudo Inverse

[Figure: b, its projection p = Ax onto the column space of A, and the projection matrix P]

AᵀAx = Aᵀb

(AᵀA)⁻¹(AᵀA)x = (AᵀA)⁻¹Aᵀb

x = (AᵀA)⁻¹Aᵀb

p = Ax = A(AᵀA)⁻¹Aᵀb

(AᵀA)⁻¹Aᵀ : pseudo inverse
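A small sketch of this slide in Eigen3, forming A⁺ = (AᵀA)⁻¹Aᵀ explicitly (for illustration only; it assumes A has full column rank, and explicit inversion is not numerically robust):

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::MatrixXd A( 3, 2 );
    A << 2, -1,
         1, -3,
         2,  1;
    Eigen::VectorXd b( 3 );
    b << 2, -3, 2;

    // Pseudo inverse A+ = (A^T A)^-1 A^T (assumes full column rank).
    Eigen::MatrixXd Aplus = ( A.transpose() * A ).inverse() * A.transpose();

    Eigen::VectorXd x = Aplus * b; // least-squares solution
    Eigen::VectorXd p = A * x;     // projection of b onto the column space of A
    Eigen::VectorXd r = b - p;     // residual

    // The residual is orthogonal to the column space: A^T r ~ 0.
    std::cout << ( A.transpose() * r ).transpose() << std::endl;
    return 0;
}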

(8)

Condition Number of a Function

It measures

• How much the output value changes for a small change in the input
• How sensitive the output is to changes or errors in the input
• How much error in the output results from an error in the input

cond(A) = κ = σmax(A) / σmin(A)

σᵢ(A) : the i-th singular value of A

‖Δx‖ / ‖x‖ ≤ cond(A) ⋅ ‖ΔA‖ / ‖A‖

(9)

Condition Number of a Function

Well-conditioned: low condition number

Ill-conditioned: high condition number

The normal equations are sometimes ill-conditioned.

A = [ 1 + 10⁻¹⁰  −1 ]
    [ −1          1 ]

AᵀA = [ 2 + 2⋅10⁻¹⁰ + 10⁻²⁰   −2 − 10⁻¹⁰ ]
      [ −2 − 10⁻¹⁰             2          ]

cond(AᵀA) = cond(A)²
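The identity cond(AᵀA) = cond(A)² is easy to check numerically. A sketch using Eigen3 (the cond() helper here is ours, not an Eigen function; note that for the near-singular A above, double-precision round-off would already lose the 10⁻²⁰ term when forming AᵀA, which is exactly the hazard this slide illustrates):

#include <iostream>
#include <Eigen/Dense>

// Condition number as the ratio of the largest to the smallest singular value.
double cond( const Eigen::MatrixXd& M )
{
    Eigen::JacobiSVD<Eigen::MatrixXd> svd( M );
    const Eigen::VectorXd& s = svd.singularValues(); // sorted in decreasing order
    return s( 0 ) / s( s.size() - 1 );
}

int main()
{
    // An arbitrary well-conditioned example.
    Eigen::MatrixXd A( 3, 2 );
    A << 1, 2,
         0, 3,
         4, 5;
    std::cout << cond( A ) * cond( A ) << " vs " << cond( A.transpose() * A ) << std::endl;
    return 0;
}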

(10)

QR Decomposition

∵ Orthogonal transformations preserve the Euclidean norm.

‖r‖₂² = ‖Ax − b‖₂²
= ‖QUx − b‖₂²   (∵ A = QU)
= ‖QᵀQUx − Qᵀb‖₂²
= ‖Ux − Qᵀb‖₂²

∴ Ux = Qᵀb

U = [ * * ⋯ * ]
    [ 0 * ⋯ * ]
    [ ⋮ ⋮ ⋱ ⋮ ]
    [ 0 0 ⋯ * ]   ← top n rows: upper triangular
    [ 0 0 ⋯ 0 ]
    [ ⋮ ⋮ ⋱ ⋮ ]
    [ 0 0 ⋯ 0 ]   ← bottom m − n rows: zero
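A sketch of the triangular solve Ux = Qᵀb in Eigen3 using HouseholderQR (one way to do it; the deck calls the triangular factor U, which Eigen stores as the upper part of matrixQR()):

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::MatrixXd A( 3, 2 );
    A << 2, -1,
         1, -3,
         2,  1;
    Eigen::VectorXd b( 3 );
    b << 2, -3, 2;

    Eigen::HouseholderQR<Eigen::MatrixXd> qr( A );

    // Thin Q (m x n) and the top n x n upper-triangular block U.
    Eigen::MatrixXd Q = qr.householderQ() * Eigen::MatrixXd::Identity( A.rows(), A.cols() );
    Eigen::MatrixXd U = qr.matrixQR().topRows( A.cols() ).triangularView<Eigen::Upper>();

    // Back-substitution: U x = Q^T b.
    Eigen::VectorXd x = U.triangularView<Eigen::Upper>().solve( Q.transpose() * b );
    std::cout << x.transpose() << std::endl; // same solution as the other methods
    return 0;
}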

(11)

Singular Value Decomposition

A = USVᵀ
A : m × n,  U : m × m,  S : m × n,  V : n × n

ex)
[  1  1 ]   [ 0.5774  0.7071   0.4082 ] [ 1.7321  0      ]
[  0  1 ] = [ 0.5774  0       −0.8165 ] [ 0       1.4142 ] [ 0  1 ]
[ −1  1 ]   [ 0.5774 −0.7071   0.4082 ] [ 0       0      ] [ 1  0 ]
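The factorization above can be reproduced with Eigen3's JacobiSVD (a sketch; the signs of paired singular vectors may differ between implementations):

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::MatrixXd A( 3, 2 );
    A << 1,  1,
         0,  1,
        -1,  1;

    // Full U (3x3) and V (2x2), as on this slide.
    Eigen::JacobiSVD<Eigen::MatrixXd> svd( A, Eigen::ComputeFullU | Eigen::ComputeFullV );
    std::cout << "U =\n" << svd.matrixU() << std::endl;
    std::cout << "S = " << svd.singularValues().transpose() << std::endl; // 1.7321 1.4142
    std::cout << "V =\n" << svd.matrixV() << std::endl;
    return 0;
}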

(12)

Singular Value Decomposition

A = USVᵀ
A : m × n,  U : m × m,  S : m × n,  V : n × n

S = [ D 0 ]   where D = diag(σ₁, σ₂, …, σᵣ)
    [ 0 0 ]

σᵢ(A) : the i-th singular value of A,  i = 1, 2, …, r

r = rank(A); r = min(m, n) if A is full rank

(13)

Singular Value Decomposition

A = USVᵀ
A : m × n,  U : m × m,  S : m × n,  V : n × n

S : singular values of A (square roots of the eigenvalues of AAᵀ or AᵀA)
U : left-hand singular vectors of A (eigenvectors of AAᵀ)
V : right-hand singular vectors of A (eigenvectors of AᵀA)

AAᵀ = (USVᵀ)(USVᵀ)ᵀ = (USVᵀ)(VSᵀUᵀ) = USVᵀVSᵀUᵀ = US²Uᵀ
AᵀA = (USVᵀ)ᵀ(USVᵀ) = (VSᵀUᵀ)(USVᵀ) = VSᵀUᵀUSVᵀ = VS²Vᵀ

λ₁ ≥ λ₂ ≥ ⋯ ≥ λᵣ > λᵣ₊₁ = ⋯ = λₙ = 0

(14)

EVD vs SVD

Singular Value Decomposition lets us decompose any matrix A as a product UΣVᵀ, where U and V are orthogonal and Σ is a diagonal matrix whose non-zero entries are the square roots of the eigenvalues of AᵀA. The columns of U and V give bases for the four fundamental subspaces.

If A is symmetric, A is diagonalizable. In other words, if A is symmetric, we can find an orthogonal matrix Q so that A = QΛQᵀ, where Λ is a diagonal matrix of the eigenvalues of A. So, it is called the eigenvalue decomposition.

Furthermore, when A is symmetric and positive semi-definite (∵ ∀λ ≥ 0):

A = QΛQᵀ = QΛ^(1/2)Λ^(1/2)Qᵀ = QΛ^(1/2)QᵀQΛ^(1/2)Qᵀ = A^(1/2)A^(1/2)

AᵀA = (QΛQᵀ)ᵀQΛQᵀ = QΛQᵀQΛQᵀ = QΛ²Qᵀ
AAᵀ = QΛQᵀ(QΛQᵀ)ᵀ = QΛQᵀQΛQᵀ = QΛ²Qᵀ
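A quick numerical check of the EVD/SVD relation (a sketch in Eigen3): the eigenvalues of the symmetric matrix AᵀA equal the squared singular values of A.

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::MatrixXd A( 3, 2 );
    A << 1,  1,
         0,  1,
        -1,  1;

    // Squared singular values of A ...
    Eigen::JacobiSVD<Eigen::MatrixXd> svd( A );
    std::cout << svd.singularValues().array().square().transpose() << std::endl; // 3 2

    // ... equal the eigenvalues of A^T A (returned in increasing order).
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> evd( A.transpose() * A );
    std::cout << evd.eigenvalues().transpose() << std::endl; // 2 3
    return 0;
}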

(15)

Pseudo Inverse

A⁺ = (AᵀA)⁻¹Aᵀ = VS⁻¹Uᵀ

A = USVᵀ
A : m × n,  U : m × m,  S : m × n,  V : n × n

From the normal equations AᵀAx = Aᵀb:

∴ x = (AᵀA)⁻¹Aᵀb

(AᵀA)⁻¹Aᵀ = ((USVᵀ)ᵀ(USVᵀ))⁻¹(USVᵀ)ᵀ
= (VSᵀUᵀUSVᵀ)⁻¹(VSᵀUᵀ)
= (VS²Vᵀ)⁻¹(VSᵀUᵀ)
= VS⁻²VᵀVSᵀUᵀ   (∵ V is orthogonal: V⁻¹ = Vᵀ, V⁻ᵀ = V)
= VS⁻¹Uᵀ

(treating S as the square, invertible diagonal block)
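A sketch of A⁺ = VS⁻¹Uᵀ via the thin SVD in Eigen3, assuming every singular value is nonzero (a robust implementation would zero out, rather than invert, singular values below a tolerance):

#include <iostream>
#include <Eigen/Dense>

int main()
{
    Eigen::MatrixXd A( 3, 2 );
    A << 2, -1,
         1, -3,
         2,  1;
    Eigen::VectorXd b( 3 );
    b << 2, -3, 2;

    Eigen::JacobiSVD<Eigen::MatrixXd> svd( A, Eigen::ComputeThinU | Eigen::ComputeThinV );

    // A+ = V * S^-1 * U^T, inverting each singular value on the diagonal.
    Eigen::MatrixXd Aplus = svd.matrixV()
                          * svd.singularValues().cwiseInverse().asDiagonal()
                          * svd.matrixU().transpose();

    std::cout << ( Aplus * b ).transpose() << std::endl; // ~ (0.91111, 1.06667)
    return 0;
}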

(16)

Comparison

The standard recommendation for linear least squares is to use QR factorization. These days QR seems to be the consensus for most practical problems.

Cholesky is cheaper but suffers from "conditioning squared", due to the fact that you have to operate on X′X, not X.

SVD has some numerical advantages, but is more costly than QR decomposition.

(17)

Eigen3 Example

2x − y = 2
x − 3y = −3
2x + y = 2

[ 2 −1 ]         [  2 ]
[ 1 −3 ] [ x ] = [ −3 ]
[ 2  1 ] [ y ]   [  2 ]

(18)

Eigen3 Example

#include <iostream>
#include <Eigen/Dense>

int main( int argc, char* argv[] )
{
    Eigen::MatrixXd A( 3, 2 );
    A(0,0) = 2;  A(0,1) = -1;
    A(1,0) = 1;  A(1,1) = -3;
    A(2,0) = 2;  A(2,1) =  1;

    Eigen::VectorXd b( 3 );
    b(0) = 2;  b(1) = -3;  b(2) = 2;

    Eigen::VectorXd x( 2 );

    // Each of the three calls below computes the same least-squares solution;
    // each assignment overwrites the previous one.
    x = ( A.transpose() * A ).ldlt().solve( A.transpose() * b );             // normal equations
    x = A.colPivHouseholderQr().solve( b );                                  // QR decomposition
    x = A.jacobiSvd( Eigen::ComputeThinU | Eigen::ComputeThinV ).solve( b ); // SVD

    std::cout << x << std::endl;

    return 0;
}

(19)

Eigen3 Example

2x − y = 2
x − 3y = −3
2x + y = 2

[ 2 −1 ]         [  2 ]
[ 1 −3 ] [ x ] = [ −3 ]
[ 2  1 ] [ y ]   [  2 ]

[ x ]   [ 0.91111 ]
[ y ] = [ 1.06667 ]

(20)

~ a volume which is fixed in space and through whose boundary matter, mass, momentum, energy can flow. ~ The boundary of control volume is