Fixed Point Iteration

(1)

Iteration

Suppose we wish to solve f (x) = 0 where:

f (x) = x −1 2(x +A

x)

Here A > 0 is some given constant. Obviously this equation has two solutions, namely x = ±A¹²

This formula is credited to Heron (a common name) of Alexandria, a Greek engineer and architect who lived somewhere between 100B.C. and A.D. 100. Incidentally the Babylonians also knew of the formula 2000 years earlier.

Let g(x) =¹₂(x +^A_x) so that f(x) = 0 is equivalent to x = g(x)

since

f (x) = 0 = x −1 2(x +A

x) x =1

2(x +A

x) = g(x) Let x0> 0 be a first approximation to A¹². define x1= g(x0), x2= g(x1), . . . , xn+1= g(xn).

This is known as successive substituion or iteration.

1

Iteration: Example

f (x) = x −1 2(x +A

x), A = 5, x0= 2

We first need to find a g(x) such that f (x) = 0 is that same as x = g(x)

So in our case as we have seen g(x) =1

2(x +A x)

We also would like to be able to measure the accuracy of our solution. In this case n= |x²n− 5|.

So the sequence {xn} is:

x0= 2 0= 1

x1=¹₂(2 +⁵₂) =⁹₄ 1= (¹₄)² x2=¹₂(⁹₄+⁵9

4) = ¹⁶¹₇₂ 2= (₇₂¹)² x3=¹₂(¹⁶¹₇₂ + 161⁵

72) =⁵¹⁸⁴¹₂₃₁₈₄ 3= (₂₃₁₈₄¹ )²

2

CIS 541 Iteration

Iteration: Example

Say we want to compute square roots and use Newton’s Method. Let R > 0, and let’s set x =p(R).

Then x is a root of the equation x²− R = 0.

Let’s apply Newton’s method.

f (x) = x²− R f⁰(x) = 2x

xn+1= xn−f^{f (x}⁰(xⁿ_n⁾)

= xn−^x2x²ⁿ^−R_n

=^2x²ⁿ^−x_2x_n²ⁿ^−R

=^x_2x²ⁿ^−R_n

=¹₂(xn−_x^R_n)

We’ve seen this!

CIS 541 Iteration

Iteration: Example

If we want to calculate √

17 with an initial approximation of 4, the sequence {xn} is:

x0= 4 x1= 4.12 x2= 4.123106

x3= 4.1231056256177

x4= 4.123105625617660549821409856

So the number of correct digits for the sequence is 1, 3, 6, 14, 28. We converge very rapidly and see the doubling of correct digits.

(2)

CIS 541 Iteration

Fixed Point Iteration

We compute a sequence of points with a formula of the form

xn+1= g(xn)

The algorithm defined by such an equation is called functional iteration. In Newton’s method g is:

g(x) = x − f (x) f⁰(x)

With our iteration formula we can generate sequences that don’t converge, or ones that do, which is what we are interested in. If the sequence {xn} converges then limn→∞xnexists. Suppose that limn→∞xn= s.

How does s and g relate if g is continous?

g(s) = g( lim

n→∞xn) = lim

n→∞g(xn) = lim

n→∞xn+1= s So g(s) = s, and we call s a fixed point of g, that is s remains fixed under successive application of the function g(x).

5

CIS 541 Iteration

Contractive Mapping Theorem

A mapping F is said to be contractive if there exists a number λ < 1 such that

|F (x) − F (y)| ≤ λ|x − y|

s

x

s

y F (x) s

F (y) s

Contractive Mapping Theorem Let C be a closed subset of the real line. If F is a contractive mapping of C into C, then F has a unique fixed point and this fixed point is the limit of every sequence obtained by xn+1= F (xn) with a starting point x0∈ C

6

CIS 541 Iteration

Contractive Mapping Theorem: Proof

Lets use the contractive property and the iteration formula to obtain:

Repeat again to get:

|xn−xn−1| ≤ λ|xn−1−xn−2| ≤ λ²|xn−2−xn−3| ≤ . . .

≤ λⁿ⁻¹|x¹− x⁰| Let’s rewrite xn in the form:

xn= x0+ (x1− x⁰) + (x2− x¹) + . . . + (xn− xⁿ⁻¹)

So the sequence {xⁿ} converges if and only if the series

∞

X

n=1

(xn− xn−1)

CIS 541 Iteration

converges. To prove it converges it is enough to prove

that _∞

X

n=1

|xⁿ− xⁿ⁻¹| converges. Using the above comparison

∞

X

n=1

|xn− xn−1| ≤

∞

X

n=1

λⁿ⁻¹|x1− x0| = 1

1 − λ|x1− x0| Which converges. Since the sequence converges, let s = limn→∞xn, so F (s) = s as shown earlier. So the contractive property implies that F is continuous. So is there a unique fixed point? Let x and y be fixed points so.

|x − y| = |F (x) − F (y)| ≤ λ|x − y|

Since λ < 1, |x − y| = 0. s belongs to C since s is the limit of a sequence lying in C.

(3)

Contractive Mapping Theorem: Example

Prove the sequence {Xn} defined as follows converges.

x0= −15

xn+1= 3 −¹2|xⁿ| n ≥ 0

|F (x) − F (y)| = |3 −1

2|x| − 3 +1 2|y|| =1

2||y| − |x||

and by the triangle inequality 1

2||y| − |x|| ≤1 2|y − x|

So by the Contractive Mapping Theorem, the sequence converges to the unique fixed point of F , which is 2.

9

Contractive Mapping Theorem: Example

Use the contractive mapping theorem to compute the fixed point of:

F (x) = 4 +1 3sin 2x

|F (x) − F (y)| =1

3| sin 2x − sin 2y|

Use the Mean-Value Theorem

f⁰(ξ)(b − a) = f(b) − f(a) F⁰(x) = 2

3cos 2x to get

1

3| sin 2x − sin 2y| =2

3| cos 2ξ||x − y| ≤2 3|x − y|

for some ξ between x and y. This shows there is a fixed point of F

10

CIS 541 Iteration

Contractive Mapping Theorem: Example

Using a computer program with an initial value of 4.

x = 4;

m = 20;

for (k = 0; k < M; k++) { x = 4 + 1/3*sin(2*x)

cout << k << ” ” << x << endl;

}

Results in

k x

1 4.3297861 2 4.2308951 3 4.2736338

...

15 4.2614840 ...

20 4.2614837

CIS 541 Iteration

Fixed Point Theorem 2

Lets build up.

Fixed Point Theorem 1: Let g(x) be continuous on [a, b] with range contained in [a, b]. Suppose that there exists a constant 0 ≤ k < 1 such that |g⁰(x)| ≤ k for all x ∈ (a, b). Then the equation x = g(x) has exactly one soultion in [a, b]

Proof. Since the range is contained in [a, b], the solution must be in [a, b]. |g⁰(x)| ≤ k < 1 implies g⁰(x) exists and is uniformly bounded on (a, b). Suppose there were two different solutions, p and q, in [a, b].

Thus p = g(p), q = g(q), p 6= q. By the Mean-Value Theorem there is a point ξ between p and q such that

g(p) − g(q) = (p − q)g⁰(ξ) = p − q

which implies g⁰(ξ) = 1 which is a contradiction.

(4)

CIS 541 Iteration

s

a

s

b b s

b s

13

CIS 541 Iteration

Fixed Point Theorem 2

Fixed Point Theorem 2:

Let g(x) be continuous on [a, b] with range contained in [a, b]. Suppose that there exists a constant k < 1 such that |g⁰(x)| ≤ k for all x ∈ (a, b). Then, if x0 ∈ [a, b], the sequence generated by our iterative function (xn+1 = g(xn)) converges to the unique solution to x = g(x) which lies in [a, b].

Moreover an error bound is given by |xn− s| ≤ kⁿ(b − a), where s is the fixed point.

Proof: All points of the sequence are in [a, b] since xn+1= g(xn) is in [a, b] if xn is in [a, b], and since x0

is in [a, b] it follows that they all are. Let s denote the unique fixed point to g, if for some n, xn = s, then xn+1= g(xn) = g(s) = s and so xn+1= s also, so by induction all numbers in the sequence after also = s.

If this isn’t the case, that is xn6= s for all n. Using the Mean-Value Theorem on the given interval with endpoints s and xn there is some point ξ between s

14

CIS 541 Iteration

and xnsuch that

g(xn) − g(s) = (xn− s)g⁰(ξ) = xn+1− s Since g⁰ is bounded by k it follows that

|xⁿ⁺¹− s| ≤ k|xⁿ− s| ≤ k(b − a)for n = 0, 1, . . . in particular,

|x1− s| ≤ k(b − a)

|x²− s| ≤ k|x¹− s| ≤ k²(b − a) ...

|xn− s| ≤ k|xn−1− s| ≤ kⁿ(b − a)

Because |k| < 1 , limn→∞kⁿ(b − a) = 0, thus limn→∞|xⁿ− s| = 0, or lim^n→∞xn= s.

CIS 541 Iteration

Fixed Point Theorem 2: Example

Using Theorem 2 does g(x) = .5(x+5/x) converge in the interval [2, 3]

g⁰(x) = .5(1 − 5 x²)

Are the hypothesis of Theorem 2 satisfied? Is the range the same as the domain?

Determine max and min values of g(x) on [2, 3]

to verify the range is contained in [2, 3]. g⁰(x) = 0 when x = √

5. Thus the relative extrema are at x = 2, x = 5^1/2, x = 3.

g(2) = 9/4 g(5^1/2) = 5^1/2 g(3) = 7/3

So if x ∈ [2, 3] then g(x) ∈ [2, 3].

(5)

Fixed Point Theorem 2: Example

Now check to see if there is a k such that |g⁰(x)| ≤ k < 1. How?

Find max of |g⁰(x)| on [2, 3] so look at g⁰⁰(x) =_x⁵₃. There are no zeros on [2, 3] so we only need to look at endpoints for extrema.

g⁰(2) = −1/8 g⁰(3) = 2/9

Thus |g⁰(x)| ≤ k = 2/9 on [2, 3].

So by Theorem 2, if x0 ∈ [2, 3], the series {xn} converges to 5^1/2 and a bound on the error is |xⁿ− 5^1/2| ≤ (2/9)ⁿ

17

Fixed Point Theorem 3

Let x = g(x) be an equation with a solution p.

Suppose that there exists a δ > 0 and a constant k < 1 such that |g⁰(x)| ≤ k whenever |x − p| ≤ δ.

If p0 ∈ I = [p − δ, p + δ] then the sequence {pn} generated by iteration (xn+1= g(xn)) converges to p.

Proof: if x ∈ I and x 6= p then the mean value theorem applies to g(x), on the interval with end points x and p. Thus, there exists a point ξ between x and p such that

g(x) − g(p) = (x − p)g⁰(ξ) Because |g⁰(ξ)| ≤ k < 1 it follows that

|g(x) − g(p)| ≤ k|x − p| < |x − p| ≤ δ

Finally, because g(p) = p we have shown that g(x) is closer to p than is x. That is, the range of g(x), for the domain I, is contained in I.

18

CIS 541 Iteration

Fixed Point Theorem 3

Since the hypothesis of Theorem 2 are met so are the conclusions so Theorem 3 is correct.

Corollary III: Let x = g(x) be an equation with a solution p. If g⁰(x) is continuous at p and g⁰(p) = 0 then the hypothesis, and therefore the conclusion of theorem 3 are satisified for some interval about p.

This tells us that the closer we get to a solution the faster we will converge.

CIS 541 Iteration

Convergence Examples

f (x) = x³− x − 5

We want a g(x) such that x = g(x) is the same as f (x) = 0

x³− x − 5 = 0

x³− 5 = x g1(x) = x³− 5 x³− x − 5 = 0

x(x²− 1) = 5

x =_x2⁵−1 g2(x) =_x2⁵−1

x³− x − 5 = 0

x³= x + 5 g3(x) = (x + 5)¹³

f (r) = 0 r ≈ 1.9

g₁⁰(x) = 3x² g⁰₁(1.9) ≈ 10.83 g2⁰(x) =_(x^−10x2−1)² g⁰2(1.9) ≈⁻¹⁹6.8

g₃⁰(x) =¹₃(x + 5)⁻²³ g⁰₃(1.9) ≈ ¹

3(6.9)²3

(6)

CIS 541 Iteration

Region of Convergence

Let’s expand f (x) about xn

f (x) = f (xn) + f⁰(xn)(x − xⁿ) +f⁰⁰(ξ)(x − xⁿ)² 2 Now set x = α where α is a root of f (x);

f (α) = 0 = f (xn) + f⁰(xn)(α − xn) +f⁰⁰(ξ)(α − xn)² 2

f (xn) + f⁰(xn)(α − xⁿ) = −f⁰⁰(ξ)(α − xⁿ)² 2 f (xn)

f⁰(xn)+ (α − xn) = −f⁰⁰(ξ)(α − xn)² 2f⁰(xn) since xn+1= xn−f^{f (x}⁰(xⁿ_n⁾) (Newton’s method)

(α − xn+1) = −f⁰⁰(ξ)(α − xn)² 2f⁰(xn)

21

CIS 541 Iteration

Region of Convergence

(α − xⁿ⁺¹) = −f⁰⁰(ξ)(α − xn)² 2f⁰(xn) let n= |xn− α|

n+1= f⁰⁰(ξ)e²_n 2f⁰(xn)

let m = max_f00(ξ) 2f⁰(x)

in I = [α ± x⁰]

n+1= ²_n or mn+1= (mn)²

So the sequence {xⁿ} approaches α if m⁰< 1 or m|α − x0| < 1

22

CIS 541 Iteration

Region of Convergence: Example

Let f (x) = x²− C and f(α) = 0 and using the iteration formula: xn+1= ¹₂

xn+_x^C_n Case 1: x > α

xn+1=xn

2

1 + C

x²_n

= Kxn

Since _x^C2

n < 1, K < 1, and the minimum of g(x) is α at xn = α. α ≤ xⁿ⁺¹ < xn and the sequence, {x0, x1, . . .} converges to α.

Case 2: 0 < x < α

xn+1= 1 2

xn+ C

xn

The minimum of g(x) at xn = α and its value is α.

So, xn+1> α bringing the result of the first iteration to the right side of α. We know from case 1 that any iteration starting at x > α will converge to the root.

CIS 541 Iteration

Iteration: Graphical Analysis

We find a g(x) such that f (x) = 0 is that same as x = g(x)

So geometrically what does that tell us about the root of f (x)?

y = g(x) y = x

(7)

Iteration: Graphical Analysis

y = g(x) y = x

Where is the root of f (x)?

25

Iteration: Graphical Analysis

y = g(x) y = x

s

x0

26

CIS 541 Iteration

Iteration: Graphical Analysis

y = g(x) y = x

s

x0

CIS 541 Root Finding

Roots

What does it mean to be a root? Where the function crosses the X-axis

Why do we want to find roots? It is an extremely useful piece of info to know.

How would you find the intersection between two curves? Set them equal and find the root.

A root r, of function f occurs when f (r) = 0.

For example:

f (x) = x²− 2x − 3

has 2 roots at r = −1 and r = 3. We can verify by substitution. We can also look at f in its factored form.

f (x) = x²− 2x − 3 = (x + 1)(x − 3)

(8)

Bisection Method

a b

tf (a)

t

f (b)

if f (a)f (b) < 0 we have a root. Why? Yes. The signs are different so it must cross x-axis an odd # of times. Touching the axis isn’t a cross.

Can we have a root if f (a)f (b) > 0. Why? Yes if it crosses an even # of times.

If we pick an x, a < x < b, will either [a, x] or [x, b]

have a root if [a, b] has a root? Yes.

29

Bisection Method:Algorithm

Find an interval (The smaller the better) [a, b]

where the the y values at the endpoints are on different sides of the x-axis.

Pick the midpoint, c, of [a, b] and see if that is a root, if it is not see if the root is in [a, c] or [c, b] and repeat on the appropriate new interval.

30

CIS 541 Root Finding CIS 541 Root Finding

Bisection Method

Our estimate of the root, c, is:

c =a + b 2

if (f (a)f (c) < 0 root is in [a, c]

if (f (b)f (c) < 0 root is in [b, c]

No we have a new interval, if we repeat we should get closer to the root. What happens to the interval at each interation? It is halved.

ci= ai+ bi

2 c0= a0+ b0

2 c1= a1+ b1

2

(9)

Bisection Method:Convergence

What is the error of c0, c1, c2, . . . , ci?

0= |r − c0| ≤^a⁰^+b2 ⁰

1= |r − c1| ≤^a¹^+b2 ¹ ≤^a⁰₂^+b2⁰

2= |r − c2| ≤â²^+b₂ ² ≤â¹₂^+b2¹≤â⁰₂^+b3⁰

i= |r − cⁱ| ≤^a₂⁰i+1^+b⁰

So does the bisection method converge? Yes!

How fast? O(log²(n))!

So for an error tolerance of b − a

2ⁱ⁺¹ ≤

i >log(b − a) − log(2) log(2)

33

Bisection Method:Example

Find a root for f (x) = x³+ 2x²− 11x − 12 = (x + 1)(x − 3)(x + 4)

Find a root in the interval [4,5]?

f (4) =4³+2×4²−11×4−12 = 64+32−44−12 = 40

f (5) =5³+2×5²−11×5−12 = 125+50−55−12 = 108 Now What?

How about between[2, 5]

f (2) = 23 + 2 × 22 − 11 × 2 − 12 = 8 + 8 − 22 − 12 = −18

f (2 + 5

2 ) = f (3.5) = 3.53 + 2 × 3.52 − 11 × 3.5 − 12 = 16.875 f (2 + 3.5

2 ) = f (2.75) = 2.753 + 2 × 2.752 − 11 × 2.75 − 12 = −6.4 f (2.75 + 3.5

2 ) = f (3.13) = 3.9 f (2.75 + 3.13

2 ) = f (2.94) = −1.6, f (3.04 + 3.13

2 ) = f (2.94) = 1.2

34

Bisection Method:Convergence Example

Say we wanted to compute a root to 32-bit binary precision. How many iterations would be needed if a=16 and b=17?

a = (10000.0)2 and b = (10001.0)2, Thus we already know four of the binary digits, so we have 20 left. If the last one is to be correct the error must be less than 2⁻²⁰(you can also use 2⁻²¹ to be conservative) so

b−a 2ⁿ⁺¹ <

17−16 2ⁿ⁺¹ < 2⁻²⁰ 2ⁿ⁺¹> 2²⁰ n ≥ 20 or

n > log 1−log 2⁻¹⁹ log 2

Bisection Method:Implementation Issues

What are some issues to be concerned about during implementation? Starting range.

How would you check to see which side of c the root is on? equality of signs.

We can use f (a)f (c) to see if the numbers have different signs. Are there any problems with this?

overflow, underflow.

Are we concerned with the number of calls to the function to evaluate it? Why? Could be expensive.

Will this always converge? Yes if it can be started properly.

(10)

Newton’s Method

x0

vf (x0)

x1

v

x2

v

Basic idea is to approximate f (x) by a straight line l(x). Lets use a line tangent to the function at a current guess as the approximation.

37

38

Newton’s Method

The tangent line is the deriviative. So:

l(x) = f⁰(x0)(x − x0) + f (x0)

So we use the root of l(x) as a new approximation to the root of f (x) and repeat the procedure.

This gives us a sequence:

x1= x0−_f^{f (x}⁰_(x⁰₀⁾₎ x2= x1−f^{f (x}⁰(x¹₁⁾)

x3= x2−_f^{f (x}⁰_(x²₂⁾₎ ...

xn+1= xn−_f^{f (x}⁰_(xⁿ_n⁾₎

Does this method always converge? Why? No, because the derivative doesn’t really predict the future.

Newton’s Method:Bad functions

x0

s

x1

s

x2

s L

L L L L L L L L L L L L LL

HH HH HH HH HH HH HH HH H H

x0

s

x1

s

x0

s

(11)

Newton’s Method:Convergence

Doesn’t always converge. But when it does converge, how fast does it do it?

Let’s use Taylor’s series and the error term Rn. Expand about the point xn

f (x) = f (xn) + f⁰(xn)(x − xn) +f⁰⁰(ξ)(x − xn)² 2

Remember ξ is between x and xn. Let’s set x to the root r, so f (r) = 0.

f (r) = 0 = f (xn) + f⁰(xn)(r − xn) +f⁰⁰(ξ)(r − xn)² 2 Now divide by f⁰(xn)

= f (xn)

f⁰(xn)+f⁰(xn)(r − xn)

f⁰(xn) +f⁰⁰(ξ)(r − xn)² 2f⁰(xn) Reduce terms and substitute our iterative formula.

xn− xn+1+ (r − xn) +f⁰⁰(ξ)(r − xn)² 2f⁰(xn)

41

Newton’s Method:Convergence

Rearrange.

r − xⁿ⁺¹= − f⁰⁰(ξ)

2f⁰(xn)(r − xⁿ)² What is r − xn+1? It is the n + 1 error or

n+1= |r − xⁿ⁺¹|

n= |r − xn|

So if we take the absolute value of both sides above we get

n+1= r − xn+1= f⁰⁰(ξ) 2f⁰(xn)²_n

So we are converging quadratically, but only if we are close to the root r and f⁰(r) 6= 0.

With quadratic convergence we almost double the number of significant digits after each iteration.

if f⁰(0) = 0 then we get at best linear convergence.

See the book for the analysis.

42

Newton’s Method:Example

What is f⁰(x):

f⁰(x) = 3x²+ 4x − 11

So

xn+1= xn− f (xn)

f⁰(xn)= xn−x³_n+ 2x²_n− 11xn− 12 3x²_n+ 4xn− 11

Start with x0= 30

x1 = 30 −28458

2809 = 10.13100748

x2 = 10.13100748 − 3.324046727 = 6.806960753 x3 = 6.806960753 − 2.069106062 = 4.737854691 x4 = 4.737854691 − 1.157209403 = 3.580645288 x5 = 3.580645288 − .4825214818 = 3.098123806 x6 = 3.098123806 − .09455278358 = 3.003571022

Newton’s Method:Implementation Issues

What problems do we have?

Well we must be able to evaluate f⁰(x) which isn’t always the case.

We have to evaluate f (x) and f⁰(x) for each iteration, which could be expensive.

Does not gaurantee convergence!

But it is very fast.

So when did Newton live?

(12)

Secant Method

x0

vf (x0)

x1

v x2

Lets approximate the function with a line, but this time a line between two approximations to the root called a secant line.

45

46

Secant Method

So what is x2?

Let’s equate the slopes between (x0, f (x0)), (x1, f (x1)) and (x1, f (x1)), (x2, 0):

f (x1) − f(x⁰)

x1− x⁰ =0 − f(x¹) x2− x¹ Rearrange:

x2= x1− f(x1) x1− x0

f (x1) − f(x0)

Do the same thing for x3 using x1and x2, in general:

xn+1= xn− f(xⁿ) xn− xⁿ⁻¹ f (xn) − f(xⁿ⁻¹)

This is a two point method since two approximations are needed.

Secant Method vs Newton’s Method

This is the same as the Newton method except that we approximate f⁰(x) as

f⁰(x) =f (xn) − f(xⁿ⁻¹) xn− xⁿ⁻¹

So how does this compare to Newton’s method?

How many function evaluations are needed per iteration? 1

Will it converge? Maybe

How about speed of convergence?

|n+1≈ C|n|^α α = 1/2(1 +√

5) ≈ 1.62

This isn’t quite as fast as Newton’s method but still superlinear and much faster than bisection. But if you consider number of evaluations it is faster.

(13)

Secant Method:Example

What is the secant iteration formula.

xn+1= xn− f(xⁿ) xn− xⁿ⁻¹ f (xn) − f(xⁿ⁻¹)

Lets use x0= 5 and x1= 0 f (5) = 108

f (0) = −12

x2= x1− f(x1)_{f (x}^x¹^−x⁰

1)−f (x₀)= 0 − −12−12−108)⁻¹²⁻⁵ = ¹₂ x3= x2− f(x²)_{f (x}^x₂²_{)−f (x}^−x¹ ₁₎==⁻¹⁶₁₃ = −1.2307 x4== −.99176

x4== −1.000198215

49

Secant Method:Implementation Issues

What are some issues to be concerned about during implementation? Starting values. The convergence properties depend on the starting values.

Does not gaurantee convergence!

What if we pick x0 and x1 such that f (x0) = f (x1)?

Who is this method named after?

50

Root finding comparison

Bisection Method

• Converges slowly

• Need two starting values on opposite sides of the root

• Always converges.

Newton’s Method

• Converges quickly near the root

• Needs two function evaluations per iteration.

• Need to be able to evaluate f⁰

• May not converge.

Root finding comparison

Secant Method

• Converges quickly near the root, but not as fast as Newton’s.

• Needs one function evaluation per iteration.

• Doesn’t need f⁰

• May not converge.

So what is the best method?

Well hybrids are the most popular, use bisection for so long then switch to Newton’s or Secant.

(14)

CIS 541 Roots of Polynomials

Polynomial Roots

P1(z) = anzⁿ+ an−1zⁿ⁻¹+ . . . + a0

And in factored form

P1(z) = (z − z1)ⁿ¹(z − z2)ⁿ². . . (z − zq)ⁿ^q with root zi having multiplicity ni. If an= 1 then the polynomial is monic.

If zi is a root of multiplicity ni for P1 then it is a root of multiplicity ni− 1 of P1⁰.

P1(z) = (z − zi)ⁿⁱS(z)

P₁⁰(z) = ni(z − zⁱ)ⁿⁱ⁻¹S(z) + (z − zⁱ)ⁿⁱS⁰(z)

= (z − zi)ⁿⁱ⁻¹[niS(z) + (z − zi)S⁰(z)]

= (z − zi)ⁿⁱ⁻¹T (z)

Therefore ziis of multiplicity at least ni− 1 for P1⁰. Is it more?

53

Polynomial Roots

If the multiplicity of ziwas more than ni− 1 for P1⁰

then zi must be a root of T (z) ⇒ T (zⁱ) = 0 But T (zi) = niS(zi) + (z − zⁱ)S⁰(zi) = niS(zi) 6= 0 Therefore it must be of multiplicity ni− 1

If the root (multiplicity) structure of P1is P1(z) = (z − z1)ⁿ¹(z − z2)ⁿ². . . (z − zq)ⁿ^q then P1and P₁⁰ have the common divisor

P2(z) = (z − z1)ⁿ¹⁻¹(z − z2)ⁿ²⁻¹. . . (z − zq)ⁿ^q⁻¹ and furthermore, that common divisor is a gcd.

54

Polynomial Roots

Remember the Euclidean gcd algorithm? Lets use that algorithm on these polynomials to get

P1(z) = Q1(z)P⁰(z) + R1(z) P₁⁰(z) = Q2(z)R1(z) + R2(z) R1(z) = Q3(z)R2(z) + R3(z) R2(z) = Q4(z)R3(z) + R4(z) ...

Rs−2(z) = Qs(z)Rs−1(z) + Rs(z) Rs−1(z) = Qs+1(z)Rs(z)

Then convert Rs(z) to monic form and call it P2(z) and repeat the Euclidean algorthm for gcd, getting the polynomials P1, P2, P3, . . . , Pq, 1

What happens to the multiplicity of each root as we proceed? They reduce by one.

So what is the multiplicity of root zifor polynomial Pj? ni− (j − 1) = nⁱ− j + 1

Polynomial Roots

Let’s divide P1by P2 this gives P1(z)

P2(z)= M1(z) = (z − z1)(z − z2) . . . (z − zq) Let’s define

Mi(z) =

( _P

i(z)

P_i+1(z) if i < q Pq(z) if i = q

The polynomials M1, M2, . . . , Mq all have only simple roots.

So for polynomial Mi what can we say about it’s roots? It has all roots that have at least multiplicity i of P and they are simple roots here.

(15)

Polynomial Roots

Let’s also create the polynomials Nias

Ni(z) =

( _M

i(z)

M_i+1(z) if i < q Mq(z) if i = q

The sum of the degrees of the Nis is q and each distinct root of P1(z) is a root of just one Ni and if the root is of multiplicity of m then it is a root of Nm(z).

57

Polynomial Roots:Example

P1(z) = z⁷− 3z⁵+ 3z³− z P₁⁰(z) = 7z⁶− 15z⁴+ 9z²− 1 Lets apply the Euclidean algorithm.

z⁷−3z⁵+3z³−z =1

7z(7z⁶−15z⁴+9z²−1)−6 7z⁵+12

7z³−6 7z 7z⁶−15z⁴+9z²−1 =−49

6 z(−6 7z⁵+12

7z³−6

7z)−z⁴+2z²−1

−6 7z⁵+12

7z³−6 7z =6

7z(−z⁴+ 2z²− 1) + 0 Now make the last polynomial monic we get P2(z) = z⁴− 2z²+ 1 and repeat on P2

z⁴− 2z²+ 1 =1

4z(4z³− 4z) − z²+ 1 4z³− 4z = −4z(−z²+ 1) + 0

58

Polynomial Roots:Example

Set P3(z) = z²− 1, and repeat

z²− 1 =1

2z(2z) − 1 2z = −2z(−1) + 0 So P4(z) = 1 Let find the Mi’s and Ni’s.

M1(z) = ^P_P¹^(z)

2(z)= z³− z M2(z) = ^P_P²^(z)

3(z)= z²− 1 M3(z) = P3(z) = z²− 1 N1(z) =^M_M¹₂^(z)_(z)= z N2(z) =^M_M²^(z)

3(z)= 1 N3(z) = M3(z) = z²− 1

So the roots of P1(z) are 0, 1 (with multiplicity 3) and -1 (with multiplicity 3) or

P1(z) = (z − 1)³(z + 1)³(z)

CIS 541 Synthetic Division

Polynomial Roots:Example 2

What is the factored form of the polynomial p = 2x⁹− 12x⁸+ 10x⁷+ 84x⁶− 258x⁵+ 252x⁴+

46x³− 276x²+ 200x − 48 P1(x) = x⁹− 6x⁸+ 5x⁷+ 42x⁶− 129x⁵+ 126x⁴

+23x³− 138x²+ 100x − 24 P2(x) = x⁵− 7x⁴+ 19x³− 25x²+ 16x − 4 P3(x) = x³− 4x²+ 5x − 2

P4(x) = x − 1 P5(x) = 1

M1(x) = x⁴+ x³− 7x²− x + 6 = (x − 2)(x + 3)(x − 1)(x + 1) M2(x) = x²− 3x + 2 = (x − 2)(x − 1)

M3(x) = x²− 3x + 2 = (x − 2)(x − 1) M4(x) = x − 1

M5(x) = 1

N1(x) = x²+ 4x + 3 = (x + 3)(x + 1) N2(x) = 1

N3(x) = x − 2 N4(x) = x − 1

p(x) = 2(x − 2)³(x + 3)(x − 1)⁴(x + 1)

(16)

Polynomials

What is a polynomial?

Let φ1(x), φ2(x),. . . φn(x) be a given set of functions. A polynomial p in the base functions {φi(x)}

is any function of the form:

p(x) = a1φ1(x) + a2φ2(x) + . . . + amφm(x) where the ai are constants. If the base functions are 1, x, x², . . . , xⁿthen a polynomial

p(x) = a0+ a1x + a2x²+ . . . + anxⁿ=

n

X

i=0

aixⁱ

is called an algebraic polynomial in one variable. If an6= 0 then the polynomial p is of degree n. We will consider only these types of polynomials.

61

Evaluating Polynomials

If we evaluate a polynomial naively how expensive (in terms of computions) is it?

a0

a1x 1 multiply a2xx 2 multiplies a3xxx 3 multiplies ...

anxx . . . x n multiplies

This gives 1 + 2 + 3 + . . . + n =ⁿ⁽ⁿ⁺¹⁾₂ multiplies.

Then another n additions to add them together.

Are there better ways? i.e. less calculations.

62

Evaluating Polynomials

What if we notice that xⁿ = xⁿ⁻¹x. Since we already calculated xⁿ⁻¹ why calculate it again? So now we have:

a1x 1 multiply a2xx 2 multiplies a3xx² 2 multiplies a4xx³ 2 multiplies ...

anxxⁿ⁻¹ 2 multiplies

Which is 2n − 1 multiplies and n additions. This is much cheaper than the naive method.

Are there still better ways?

Evaluating Polynomials

What if we write the polynomial in nested form:

p(x) = a0+ x(a1+ x(a2+ . . . + x(an−1+ x(an)) . . .)

Now we have n multiplications and n additions.

(17)

Synthetic Division (Horner’s Method)

Horner’s only contribution to mathematics was published in 1819, however the same method was proposed by ruffini a few years earlier. But, the method was actually published by chu in 1303, but we know it as Horner’s method because of de Morgan

If a polynomial p and a complex number x0are given then Horner’s algorithm will give the number p(x0) and the polynomial

q(x) = p(x) − p(x0) x − x0

q is of degree one less then p and by rearranging p(x) = (x − x0)q(x) + p(x0)

If we rewrite the unknown polynomial q and the known polynomial p in the following form

p(x) = a0+ a1x + a2x²+ . . . + anxⁿ q(x) = b0+ b1x + b2x²+ . . . + bn−1xⁿ⁻¹ and plug in to the above formula we get

65

Synthetic Division (Horner’s Method)

bn−1 = an

bn−2 = an−1+ x0bn−1

...

b0 = a1+ x0b1

p(x0) = a0+ x0b0

We can use this method to evaluate a polynomial quickly with n multiplies and n additions. This method is also good for doing things by hand. To do it by hand set it up the following way

an an−1 an−2 . . . a0

x0 x0bn−1 x0bn−2 . . . x0b0

bn−1 bn−2 bn−3 . . . p(x0)

66

Synthetic Division: Example

Evaluate p(x) = x⁴− 4x³+ 7x²− 5x − 2 at x = 3;

First set up the knowns:

1 −4 7 −5 −2 3

Start doing the multiplies and adds:

1 −4 7 −5 −2

3 3 ∗ 0 3 ∗ 1 1 + 0 −4 + 3 and finally

1 −4 7 −5 −2

3 3 −3 12 21

1 −1 4 7 19

so p(3) = 19 and we can write p as

p(x) = (x − 3)(x³− x²+ 4x + 7) + 19

Synthetic Division: Deflation

We can also use synthetic division for deflation, which is removing a linear factor from a polynomial. If x0is a zero of the polynomial p, then x − x⁰is a factor of p, and conversely. And the remaining zeros (roots) of p are the n − 1 roots of p(x)/(x − x0).

Example Deflate p(x) = x⁴− 4x³+ 7x²− 5x − 2 by noting that 2 is a root.

First set up the synthetic division and calculate p(2)

1 −4 7 −5 −2

2 2 −4 6 2

1 −2 3 1 0

Since p(2) = 0, we know it is indeed a root. And x⁴− 4x³+ 7x²− 5x − 2 = (x − 2)(x³− 2x²+ 3x + 1)

(18)

Synthetic Division: Derivative

Sometime we need not only to calculate p(x) but also p⁰(x) such as with Netwon’s method. Using synthetic division

p(x) = (x − x0)q(x) + p(x0)

If we use the product rule and differentiate we get p⁰(x) = q(x) + (x − x0)q⁰(x)

So p⁰(x0) is

p⁰(x0) = q(x0) + (x0− x0)q⁰(x0) = q(x0) Thus p⁰(x0) can be found by synthetic division on the polynomial q(x)

69

Synthetic Division: Derivative Example

For p(x) = 2x⁵− 7x³+ 4x − 5 find p(3) and p⁰(3).

Find p(3) by synthetic division

2 0 −7 0 4 −5

3 6 18 33 99 309

2 6 11 33 103 304 Now find p⁰(3)

2 6 11 33 103

3 6 36 141 522

2 12 47 174 625 So p⁰(3) = 625

70

Synthetic Division and Taylor expansion

Let p(x) be a polynomial and suppose we desire the coefficients ck in the equation

p(x) = anxⁿ+ an−1xⁿ⁻¹+ . . . + a0

= cn(x − x⁰)ⁿ+ cn−1(x − x⁰)ⁿ⁻¹+ . . . + c0

Using Taylor’s Theorem we know that ck=^p^(k)_k!^(x⁰⁾. This is expensive to calculate. Is there a more efficient way?

Notice that p(x0) = c0. And this coefficient (c0) can be obtained by applying synthetic division to the polynomial p, using the point x0. We also obtain the polynomial q(x) such that

q(x) = p(x) − p(x⁰) x − x⁰

Synthetic Division and Taylor expansion

q(x) = p(x) − p(x0) x − x0

= cn(x − x0)ⁿ+ cn−1(x − x0)ⁿ⁻¹+ . . . + c0

x − x0 −p(x0)

x − x0

= cn(x−x0)ⁿ⁻¹+cn−1(x−x0)ⁿ⁻²+. . .+c1+ c0

x − x0− c0

x − x0

= cn(x − x0)ⁿ⁻¹+ cn−1(x − x0)ⁿ⁻²+ . . . + c1

This shows that the coefficient c1can be obtained by applying synthetic division to the polynomial q with point x0 because c1= q(x0). We repeat the process until all coefficients ck are found.

(19)

Synthetic Division: Taylor Expansion:Example

Find the Taylor expansion of the polynomial p about the point 3.

p(x) = x⁴− 4x³+ 7x²− 5x − 2

1 −4 7 −5 −2

3 3 −3 12 21

1 −1 4 7 19

3 3 6 30

1 2 10 37

3 3 15

1 5 25

3 3

1 8

So

p(x) = (x−3)⁴+8(x−3)³+25(x−3)²+37(x−3)+19

73

Polynomials: Localizing zeros

Therorem: All zeros of the polynomial p(x) = anxⁿ+ an−1xⁿ⁻¹+ . . . + a0lie in the open disk whose center is at the origin of the complex plane and whose radius is

ρ = 1 + |an|⁻¹ max

0≤k≤n|ak|

Example: Find a disk centered at the origin that contains all the zeros of the polynomial

p(x) = x⁴− 4x³+ 7x²− 5x − 2

One such disk would be ρ = 1 + |a⁴|⁻¹ max

0≤k≤n|a^k| = 1 +7 1 = 8

74

Polynomials: Localizing zeros

Take our polynomial p(x) = anxⁿ+ an−1xⁿ⁻¹+ . . . + a0 and consider the function s(x) = xⁿp(1/x).

This gives

s(x) = xⁿ

"

an

1 x

n

+ an−1

1 x

n−1

+ . . . + a0

#

= an+ an−1x + . . . + a0xⁿ

s is a polynomial of degree at most n. It has the same coefficients as p except in reverse order. Also for a nonzero complex number z the condition p(z) = 0 is equivalent to s(1/z) = 0. Thus

Therorem:If all the zeros of s are in the disk{z :

|z| ≤ ρ}, then all the nonzero roots (r is a root and r 6= 0) of p are outside of the disk{z : |z| < ρ⁻¹}

Polynomials: Localizing zeros

Example: Find a disk centered at the origin that contains no zeros of p, where

p(x) = x⁴− 4x³+ 7x²− 5x − 2

Using the above thereom we get

s(x) = −2x⁴− 5x³+ 7x²− 4x + 1

and all zeros of s lie in a disk centered at the origin with a radius of

ρ = 1 + |a⁴|⁻¹ max

0≤k≤n|a^k| =9 2

Therefore, the zeros of p lie outside the disk of radius²₉ and furthermore all zeros of p lie in the ring²₉< |z| < 8 in the complex plane.

(20)

CIS 541 Systems of Linear Equations

Systems of Linear Equations

We want to solve a system of n linear equations in n unknowns such as











a11x1+ a12x2+ a13x3+ . . . + a1nxn= b1

a21x1+ a22x2+ a23x3+ . . . + a2nxn= b2

a31x1+ a32x2+ a33x3+ . . . + a3nxn= b3

...

ai1x1+ ai2x2+ ai3x3+ . . . + ainxn= bi

...

an1x1+ an2x2+ an3x3+ . . . + annxn= bn

We can write this system in compact form as

n

X

j=1

aijxj= bi (1 ≤ i ≤ n)

where the aij’s and bi’s are real numbers and the unknowns are the xj’s

77

Systems of Linear Equations: Example

Solve the following system of equations for the unknowns:











6x1− 2x2+ 2x3+ 4x4= 16 12x1− 8x2+ 6x3+ 10x4= 26 3x1− 13x2+ 9x3+ 3x4= −19

−6x¹+ 4x2+ x3− 18x⁴= −34 How do we solve this?

We could solve for x1in one of the equations and substitute into the other 3, then we are down to three unknown and three equations. Then keep doing it.

And for only 4 unknowns, that may not be bad. What about if we had 50 unknowns?

What if we got rid of one of the unknowns, say x1

in all but one equation?

How can we do this?

How about making the coefficients 0 for x1 by adding or subtracting equations?

78

Systems of Linear Equations: Example











6x1− 2x2+ 2x3+ 4x4= 16 12x1− 8x2+ 6x3+ 10x4= 26 3x1− 13x²+ 9x3+ 3x4= −19

−6x1+ 4x2+ x3− 18x4= −34 So given our system of equations, how can we remove the coefficient for x1in the second equation?

What about subtracting twice the first equation from it?

12x1− 8x2+ 6x3+ 10x4= 26

− 2(6x1− 2x2+ 2x3+ 4x4) = 2(16)

12x1− 8x2+ 6x3+ 10x4= 26 + −12x¹+ 4x2− 4x³− 8x⁴) = −32 0x1− 4x2+ 2x3+ 2x4= −6 Where did we get the 2 from? Isn’t it just ¹²₆?

Systems of Linear Equations: Example











6x1− 2x2+ 2x3+ 4x4= 16 12x1− 8x2+ 6x3+ 10x4= 26 3x1− 13x²+ 9x3+ 3x4= −19

−6x1+ 4x2+ x3− 18x4= −34 What about the third row?

That’s right if we multiply the first row by ³₆ = ¹₂ and subtract, and for the fourth row it would be

−6 6 = −1

After the subtractions we end up with:











6x1− 2x2+ 2x3+ 4x4= 16

− 4x2+ 2x3+ 2x4= −6

− 12x2+ 8x3+ x4= −27 2x2+ 3x3− 14x⁴= −18 The first equation was only used, and not altered. In this context we call it the pivot equation.