6 Orthogonality and Least Squares
6.3 ORTHOGONAL PROJECTIONS
▪ The orthogonal projection of a point in ℝ² onto a line through the origin has an important analogue in ℝⁿ.
▪ Given a vector y and a subspace W in ℝⁿ, there is a vector ŷ in W such that (1) ŷ is the unique vector in W for which y − ŷ is orthogonal to W, and (2) ŷ is the unique vector in W closest to y. See the following figure.
THE ORTHOGONAL DECOMPOSITION THEOREM
▪ These two properties of ŷ provide the key to finding the least-squares solutions of linear systems.
▪ Theorem 8: Let W be a subspace of ℝⁿ. Then each y in ℝⁿ can be written uniquely in the form
  y = ŷ + z ----(1)
  where ŷ is in W and z is in W⊥.
▪ In fact, if {u₁,…,u_p} is any orthogonal basis of W, then
  ŷ = (y·u₁)/(u₁·u₁) u₁ + ⋯ + (y·u_p)/(u_p·u_p) u_p ----(2)
  and z = y − ŷ.
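Formula (2) is straightforward to compute. The sketch below (Python with numpy; the helper name `proj_W` and the sample vectors are illustrative choices, not from the text) sums the terms (y·u)/(u·u) u over an orthogonal basis and checks that z = y − ŷ is orthogonal to each basis vector, as Theorem 8 asserts.

```python
import numpy as np

def proj_W(y, basis):
    # Formula (2): sum of (y·u / u·u) u over an orthogonal basis of W.
    y = np.asarray(y, dtype=float)
    y_hat = np.zeros_like(y)
    for u in basis:
        u = np.asarray(u, dtype=float)
        y_hat += (y @ u) / (u @ u) * u
    return y_hat

# Illustrative data: W is the xy-plane in R^3, spanned by an orthogonal pair.
u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, -1.0, 0.0])
y = np.array([3.0, 4.0, 5.0])

y_hat = proj_W(y, [u1, u2])   # the projection ŷ; here (3, 4, 0)
z = y - y_hat                 # z = y − ŷ, in W⊥; here (0, 0, 5)
print(y_hat, z @ u1, z @ u2)  # z·u1 and z·u2 are (numerically) 0
```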
▪ The vector ŷ in (1) is called the orthogonal projection of y onto W and often is written as proj_W y. See the following figure.
▪ Proof: Let {u₁,…,u_p} be any orthogonal basis for W, and define ŷ by (2).
▪ Then ŷ is in W because ŷ is a linear combination of the basis vectors u₁,…,u_p.
▪ Let z = y − ŷ.
▪ Since u₁ is orthogonal to u₂,…,u_p, it follows from (2) that
  z·u₁ = (y − ŷ)·u₁ = y·u₁ − (y·u₁)/(u₁·u₁) (u₁·u₁) − 0 − ⋯ − 0
       = y·u₁ − y·u₁ = 0
▪ Thus z is orthogonal to u₁.
▪ Similarly, z is orthogonal to each u_j in the basis for W.
▪ Hence z is orthogonal to every vector in W.
▪ That is, z is in W⊥.
▪ To show that the decomposition in (1) is unique, suppose y can also be written as y = ŷ₁ + z₁, with ŷ₁ in W and z₁ in W⊥.
▪ Then ŷ + z = ŷ₁ + z₁ (since both sides equal y), and so
  ŷ − ŷ₁ = z₁ − z
▪ This equality shows that the vector v = ŷ − ŷ₁ is in W and in W⊥ (because z₁ and z are both in W⊥, and W⊥ is a subspace).
▪ Hence v·v = 0, which shows that v = 0.
▪ This proves that ŷ = ŷ₁ and also z₁ = z.
▪ The uniqueness of the decomposition (1) shows that the orthogonal projection ŷ depends only on W and not on the particular basis used in (2).
▪ Example 1: Let u₁ = (2, 5, −1), u₂ = (−2, 1, 1), and y = (1, 2, 3). Observe that {u₁, u₂} is an orthogonal basis for W = Span{u₁, u₂}. Write y as the sum of a vector in W and a vector orthogonal to W.
▪ Solution: The orthogonal projection of y onto W is
  ŷ = (y·u₁)/(u₁·u₁) u₁ + (y·u₂)/(u₂·u₂) u₂
    = (9/30)(2, 5, −1) + (3/6)(−2, 1, 1) = (−2/5, 2, 1/5)
▪ Also
  y − ŷ = (1, 2, 3) − (−2/5, 2, 1/5) = (7/5, 0, 14/5)
▪ Theorem 8 ensures that y − ŷ is in W⊥.
▪ To check the calculations, verify that y − ŷ is orthogonal to both u₁ and u₂ and hence to all of W.
▪ The desired decomposition of y is
  y = (1, 2, 3) = (−2/5, 2, 1/5) + (7/5, 0, 14/5)
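Example 1 can be confirmed in a few lines of numpy (an added check, not part of the text), using formula (2) with the two basis vectors:

```python
import numpy as np

# Data from Example 1.
u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])
y = np.array([1.0, 2.0, 3.0])

# Formula (2): y_hat = (y·u1 / u1·u1) u1 + (y·u2 / u2·u2) u2
y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2
z = y - y_hat

print(y_hat)           # ≈ (−2/5, 2, 1/5)
print(z)               # ≈ (7/5, 0, 14/5)
print(z @ u1, z @ u2)  # both ≈ 0, confirming z is orthogonal to W
```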
PROPERTIES OF ORTHOGONAL PROJECTIONS
▪ If {u₁,…,u_p} is an orthogonal basis for W and if y happens to be in W, then the formula for proj_W y is exactly the same as the representation of y given in Theorem 5 in Section 6.2.
▪ In this case, proj_W y = y.
▪ That is, if y is in W = Span{u₁,…,u_p}, then proj_W y = y.
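This property is easy to see numerically. A quick numpy illustration (the sample basis below is an illustrative choice): projecting a vector that already lies in W returns the vector unchanged.

```python
import numpy as np

u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, -1.0, 0.0])  # orthogonal basis for a plane W

def proj(y):
    # Formula (2) for this two-vector basis.
    return (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2

y_in_W = 2 * u1 - 3 * u2  # any linear combination lies in W
print(proj(y_in_W))       # equals y_in_W itself: proj_W y = y
```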
THE BEST APPROXIMATION THEOREM
▪ Theorem 9: Let W be a subspace of ℝⁿ, let y be any vector in ℝⁿ, and let ŷ be the orthogonal projection of y onto W. Then ŷ is the closest point in W to y, in the sense that
  ‖y − ŷ‖ < ‖y − v‖ ----(3)
  for all v in W distinct from ŷ.
▪ The vector ŷ in Theorem 9 is called the best approximation to y by elements of W.
▪ The distance from y to v, given by ‖y − v‖, can be regarded as the "error" of using v in place of y.
▪ Theorem 9 says that this error is minimized when v = ŷ.
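Theorem 9 can be probed numerically. The sketch below (the basis, the point y, and the random sampling are illustrative choices, not from the text) compares ‖y − ŷ‖ against ‖y − v‖ for many random vectors v in W; inequality (3) says ŷ always wins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative orthogonal basis for a plane W in R^3, and a point y.
u1 = np.array([1.0, 0.0, 1.0])
u2 = np.array([1.0, 0.0, -1.0])
y = np.array([2.0, 3.0, 4.0])

y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2
err_hat = np.linalg.norm(y - y_hat)  # the minimal "error" of Theorem 9

# Inequality (3): no other v in W does better than y_hat.
for _ in range(1000):
    c1, c2 = rng.normal(size=2)
    v = c1 * u1 + c2 * u2  # an arbitrary vector in W
    assert err_hat <= np.linalg.norm(y - v) + 1e-12

print(err_hat)  # 3.0: the length of the component of y orthogonal to W
```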
▪ Inequality (3) leads to a new proof that ŷ does not depend on the particular orthogonal basis used to compute it.
▪ If a different orthogonal basis for W were used to construct an orthogonal projection of y, then this projection would also be the closest point in W to y, namely, ŷ.
▪ Proof: Take v in W distinct from ŷ. See the following figure.
▪ Then ŷ − v is in W.
▪ By the Orthogonal Decomposition Theorem, y − ŷ is orthogonal to W.
▪ In particular, y − ŷ is orthogonal to ŷ − v (which is in W).
▪ Since
  y − v = (y − ŷ) + (ŷ − v)
  the Pythagorean Theorem gives
  ‖y − v‖² = ‖y − ŷ‖² + ‖ŷ − v‖²
▪ (See the colored right triangle in the figure on the previous slide. The length of each side is labeled.)
▪ Now ‖ŷ − v‖² > 0 because ŷ − v ≠ 0, and so inequality (3) follows immediately.
PROPERTIES OF ORTHOGONAL PROJECTIONS
▪ Example 2: The distance from a point y in ℝⁿ to a subspace W is defined as the distance from y to the nearest point in W. Find the distance from y to W = Span{u₁, u₂}, where
  y = (−1, −5, 10), u₁ = (5, −2, 1), u₂ = (1, 2, −1)
▪ Solution: By the Best Approximation Theorem, the distance from y to W is ‖y − ŷ‖, where ŷ = proj_W y.
▪ Since {u₁, u₂} is an orthogonal basis for W,
  ŷ = (15/30) u₁ + (−21/6) u₂ = (1/2)(5, −2, 1) − (7/2)(1, 2, −1) = (−1, −8, 4)
  y − ŷ = (−1, −5, 10) − (−1, −8, 4) = (0, 3, 6)
  ‖y − ŷ‖² = 3² + 6² = 45
▪ The distance from y to W is √45 = 3√5.
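The arithmetic of Example 2 can be double-checked with numpy (an added verification, not part of the text):

```python
import numpy as np

# Data from Example 2.
y = np.array([-1.0, -5.0, 10.0])
u1 = np.array([5.0, -2.0, 1.0])
u2 = np.array([1.0, 2.0, -1.0])

# Formula (2), then the distance from y to W per the Best Approximation Theorem.
y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2
dist = np.linalg.norm(y - y_hat)

print(y_hat)                 # ≈ (−1, −8, 4)
print(dist, 3 * np.sqrt(5))  # both ≈ 6.7082
```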
▪ Theorem 10: If {u₁,…,u_p} is an orthonormal basis for a subspace W of ℝⁿ, then
  proj_W y = (y·u₁) u₁ + (y·u₂) u₂ + ⋯ + (y·u_p) u_p ----(4)
  If U = [u₁ u₂ ⋯ u_p], then
  proj_W y = U Uᵀ y for all y in ℝⁿ ----(5)
▪ Proof: Formula (4) follows immediately from (2) in Theorem 8.
▪ Also, (4) shows that proj_W y is a linear combination of the columns of U using the weights y·u₁, y·u₂, …, y·u_p.
▪ The weights can be written as u₁ᵀy, u₂ᵀy, …, u_pᵀy, showing that they are the entries in Uᵀy and justifying (5).
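The agreement between formulas (4) and (5) is easy to confirm in numpy (the orthonormal pair below is an illustrative choice, not from the text):

```python
import numpy as np

# An orthonormal basis for a plane W in R^3 (illustrative choice).
u1 = np.array([3.0, 0.0, 4.0]) / 5.0  # unit vector
u2 = np.array([0.0, 1.0, 0.0])        # unit vector orthogonal to u1
U = np.column_stack([u1, u2])         # the n×p matrix U of Theorem 10

y = np.array([1.0, 2.0, 3.0])

p4 = (y @ u1) * u1 + (y @ u2) * u2  # formula (4): weights y·u_j
p5 = U @ U.T @ y                    # formula (5): U Uᵀ y

print(p4, p5)  # identical: ≈ (1.8, 2, 2.4)
```

Note that (5) requires the basis to be orthonormal; with a merely orthogonal basis, U Uᵀ y would omit the 1/(u_j·u_j) factors of formula (2).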