How Do the Roots of a Polynomial Depend on the Coefficients?


Posted

The literature is full of methods for finding the roots of polynomials; however, few if none ask how well the roots can be determined, even with a perfect root-finding method. Since the coefficients of the polynomial are most likely stored as floating-point numbers, the floating-point resolution leads to an absolute bound on the accuracy of the roots. Uncertainty of the inputs to the root finding lead to uncertainty in the outputs.

Assume that we have already found a root z of a polynomial P(x), so that

k P k z k = 0 . SUM from k P sub k z sup k = 0 "."

(1)

Rearrange this to pull out the nth polynomial coefficient,

P n = k n P k z k n . P sub n = - SUM from { k <> n } P sub k z sup { k - n } "."

(2)

Now take the derivative with respect to z,

dP n dz = k n ( k n ) P k z k n 1 . { dP sub n } over dz = - SUM from { k <> n } (k-n) P sub k z sup { k - n - 1 } "."

(3)

Note that since (k-n) is zero when k=n, we can add back the nth term into the summation, giving

dP n dz = k ( k n ) P k z k n 1 . { dP sub n } over dz = - SUM from k (k-n) P sub k z sup { k - n - 1 } "."

(4)

The monomials with n as a constant in (k-n) sum up to the original polynomial, which is zero. Pulling out the common zn term gives,

dP n dz = z n k k P k z k 1 . { dP sub n } over dz = - z sup -n SUM from k k P sub k z sup { k - 1 } "."

(5)

The sum is just the derivative of the entire polynomial, so we get the nice compact result that

dP n dz = z n dP dz . { dP sub n } over dz = - z sup -n dP over dz "."

(6)

Flipping over the derivative gives the change of the root with respect to the nth polynomial term,

dz dP n = z n ( dP dz ) 1 . dz over { dP sub n } = -z sup n left ( dP over dz right ) sup -1 "."

(7)

Multiple Roots

Note that this expression is invalid when the derivative of the polynomial at z is zero. This is exactly the case where the polynomial has multiple roots at z. There are two distinct cases for multiple roots. The first is when the roots are fundamentally independent and just happen to coincide. One example of this is from optical raytracing, where a ray can just graze a surface. For this sort of polynomial, the above result is correct; the polynomial is unstable with respect to the coefficients, and the slightest change can alter the root from real to complex.

The alternative is when the roots are structurally multiple. An example of this is when the polynomial was created as the square of another polynomial. In this case, the dependence of the coefficients can be found by looking at the derivative. Define a new polynomial Q(z):

Q ( z ) = P k P 0 k z k 1 Q(z) = SUM { {P sub k} over {P sub 0} } k z sup {k-1}

(8)

If the root in question is double, then the sensitivity can be found by using Q(z) in place of P(z) in equation (7). For higher multiplicities, just repeat the derivative process on Q(z).

Numerical Analysis

Now define δ as the relative distance between two adjacent floating point numbers. In other words, if a real number is rounded to the nearest floating point number N, N would collect up all the numbers in the range [ (1-δ/2)N,(1+δ/2)N]. The maximum error in a polynomial term is then

E max , k = δ 2 | P k | E sub {max,k} = %delta over 2 abs { P sub k }

(9)

and the rms error, assuming that the errors are uniformly distributed, is

E rms , k = δ 12 | P k | . E sub {rms,k} = %delta over sqrt 12 abs { P sub k } "."

(10)

Ignoring multiple roots, the maximum error is then the sum of the individual errors for each term, or

E max ( z ) = k | dz dP k | δ 2 | P k | . E sub max (z) = SUM from k abs { { dz over { dP sub k } } } %delta over 2 abs { P sub k } "."

(11)

Pulling out common terms gives a nice, compact expression for the maximum error in the root due to the floating point accuracy of the polynomial coefficients,

E max ( z ) = δ 2 | dP dz | 1 k | P k z k | . E sub max (z) = %delta over 2 { abs { dP over dz } } sup -1 SUM from k abs { P sub k z sup k } "."

(12)

The rms or root-mean-square error is

E rms ( z ) = k ( dz dP k δ 12 P k ) 2 . E sub rms (z) = sqrt { SUM from k left ( dz over { dP sub k } %delta over { sqrt { 12 } } P sub k right ) sup 2 } "."

(13)

Pulling out common terms gives

E rms ( z ) = δ 12 | dP dz | 1 k P k 2 z 2 k . E sub rms (z) = %delta over { sqrt 12 } { abs { dP over dz } } sup -1 sqrt { SUM from k P sub k sup 2 z sup 2k } "."

(14)

These expression do not consider multiple roots; however, these expressions do give us a means to sensibly determine multiplicity.

Two roots, z1 and z2, cannot be a pair of multiple roots if

| z 1 z 2 | > E max ( z 1 ) + E max ( z 2 ) . abs { z sub 1 - z sub 2 } > E sub max ( z sub 1 ) + E sub max ( z sub 2 ) "."

(15)

If, on the other hand, (15) does not hold, then the two roots could possibly coincide. Similarly, if

| z 1 z 2 | 2 < C E rms 2 ( z 1 ) + E rms 2 ( z 2 ) { abs { z sub 1 - z sub 2 } } sup 2 < C sqrt { E sub rms sup 2( z sub 1 ) + E sub rms sup 2 ( z sub 2 ) }

(16)

holds for some constant C near 1, then the two roots quite likely coincide. Different choices for C correspond to different likelihoods of root coincidence.


Categories Numerical Analysis

← Older Newer →