How Do the Roots of a Polynomial Depend on the Coefficients?

Posted

The literature is full of methods for finding the roots of polynomials; however, few if none ask how well the roots can be determined, even with a perfect root-finding method. Since the coefficients of the polynomial are most likely stored as floating-point numbers, the floating-point resolution leads to an absolute bound on the accuracy of the roots. Uncertainty of the inputs to the root finding lead to uncertainty in the outputs.

Assume that we have already found a root z of a polynomial P(x), so that

$\sum _{k}{P}_{k}{z}^{k}=0\text{.}$

(1)

Rearrange this to pull out the nth polynomial coefficient,

${P}_{n}=-\sum _{k\ne n}{P}_{k}{z}^{k-n}\text{.}$

(2)

Now take the derivative with respect to z,

$\frac{{\mathit{dP}}_{n}}{\mathit{dz}}=-\sum _{k\ne n}\left(k-n\right){P}_{k}{z}^{k-n-1}\text{.}$

(3)

Note that since (k-n) is zero when k=n, we can add back the nth term into the summation, giving

$\frac{{\mathit{dP}}_{n}}{\mathit{dz}}=-\sum _{k}\left(k-n\right){P}_{k}{z}^{k-n-1}\text{.}$

(4)

The monomials with n as a constant in (k-n) sum up to the original polynomial, which is zero. Pulling out the common zn term gives,

$\frac{{\mathit{dP}}_{n}}{\mathit{dz}}=-{z}^{-n}\sum _{k}k{P}_{k}{z}^{k-1}\text{.}$

(5)

The sum is just the derivative of the entire polynomial, so we get the nice compact result that

$\frac{{\mathit{dP}}_{n}}{\mathit{dz}}=-{z}^{-n}\frac{\mathit{dP}}{\mathit{dz}}\text{.}$

(6)

Flipping over the derivative gives the change of the root with respect to the nth polynomial term,

$\frac{\mathit{dz}}{{\mathit{dP}}_{n}}=-{z}^{n}{\left(\frac{\mathit{dP}}{\mathit{dz}}\right)}^{-1}\text{.}$

(7)

Multiple Roots

Note that this expression is invalid when the derivative of the polynomial at z is zero. This is exactly the case where the polynomial has multiple roots at z. There are two distinct cases for multiple roots. The first is when the roots are fundamentally independent and just happen to coincide. One example of this is from optical raytracing, where a ray can just graze a surface. For this sort of polynomial, the above result is correct; the polynomial is unstable with respect to the coefficients, and the slightest change can alter the root from real to complex.

The alternative is when the roots are structurally multiple. An example of this is when the polynomial was created as the square of another polynomial. In this case, the dependence of the coefficients can be found by looking at the derivative. Define a new polynomial Q(z):

$Q\left(z\right)=\sum \frac{{P}_{k}}{{P}_{0}}k{z}^{k-1}$

(8)

If the root in question is double, then the sensitivity can be found by using Q(z) in place of P(z) in equation (7). For higher multiplicities, just repeat the derivative process on Q(z).

Numerical Analysis

Now define δ as the relative distance between two adjacent floating point numbers. In other words, if a real number is rounded to the nearest floating point number N, N would collect up all the numbers in the range [ (1-δ/2)N,(1+δ/2)N]. The maximum error in a polynomial term is then

${E}_{\mathit{max},k}=\frac{\mathrm{\delta }}{2}|{P}_{k}|$

(9)

and the rms error, assuming that the errors are uniformly distributed, is

${E}_{\mathit{rms},k}=\frac{\mathrm{\delta }}{\sqrt{12}}|{P}_{k}|\text{.}$

(10)

Ignoring multiple roots, the maximum error is then the sum of the individual errors for each term, or

${E}_{\mathit{max}}\left(z\right)=\sum _{k}|\frac{\mathit{dz}}{{\mathit{dP}}_{k}}|\frac{\mathrm{\delta }}{2}|{P}_{k}|\text{.}$

(11)

Pulling out common terms gives a nice, compact expression for the maximum error in the root due to the floating point accuracy of the polynomial coefficients,

${E}_{\mathit{max}}\left(z\right)=\frac{\mathrm{\delta }}{2}{|\frac{\mathit{dP}}{\mathit{dz}}|}^{-1}\sum _{k}|{P}_{k}{z}^{k}|\text{.}$

(12)

The rms or root-mean-square error is

${E}_{\mathit{rms}}\left(z\right)=\sqrt{\sum _{k}{\left(\frac{\mathit{dz}}{{\mathit{dP}}_{k}}\frac{\mathrm{\delta }}{\sqrt{12}}{P}_{k}\right)}^{2}}\text{.}$

(13)

Pulling out common terms gives

${E}_{\mathit{rms}}\left(z\right)=\frac{\mathrm{\delta }}{\sqrt{12}}{|\frac{\mathit{dP}}{\mathit{dz}}|}^{-1}\sqrt{\sum _{k}{P}_{k}^{2}{z}^{2k}}\text{.}$

(14)

These expression do not consider multiple roots; however, these expressions do give us a means to sensibly determine multiplicity.

Two roots, z1 and z2, cannot be a pair of multiple roots if

$|{z}_{1}-{z}_{2}|>{E}_{\mathit{max}}\left({z}_{1}\right)+{E}_{\mathit{max}}\left({z}_{2}\right)\text{.}$

(15)

If, on the other hand, (15) does not hold, then the two roots could possibly coincide. Similarly, if

${|{z}_{1}-{z}_{2}|}^{2}

(16)

holds for some constant C near 1, then the two roots quite likely coincide. Different choices for C correspond to different likelihoods of root coincidence.

Categories Numerical Analysis