Errors

In most computational processes, perfect accuracy is impossible. We must make certain approximations, and this introduces errors. There are four basic types of errors:

1. Model errors. Our mathematical model of a real-world situation is almost never completely exact; almost always, it ignores some aspects which we hope are relatively unimportant, e.g. air resistance in the motion of a falling body.
2. Measurement errors. Even the parts of the real-world situation that are adequately modelled usually involve numerical parameters that come from measurements, and these measurements are not perfectly accurate.
3. Round-off errors. A computer can only work with a finite number of decimal places. The results of its arithmetic operations are only approximations to the true results.
4. Discretization errors, also called truncation errors. Our formulas themselves may only be approximations to the true values, and thus would not produce correct answers even if there were no errors of the first three types.

I don't have much to say about the first two types of error. The mathematician often tends to ignore them, being concerned only with solving a given equation, not with how well the results apply to the real world. I'll just mention that, in some situations where round-off and discretization errors seriously affect the accuracy of our results, we don't have to feel so bad: a problem so sensitive to these errors is probably also sensitive to the first two types, so that even with perfect arithmetic and an exact formula it would be impossible to get truly accurate results.


Round-off errors


``Real'' numbers are generally represented in computers using ``floating-point'' arithmetic, which is similar to so-called ``scientific notation''. If a base-10 system were used, you would write a number in the form $r \times 10^k$, where $k$ is an integer and $0.1 \le r < 1$. Only a certain number $n$ of digits of $r$ can be stored. Suppose you used this type of system with $n = 4$, and you wanted to multiply $\pi$ (represented as $.3142 \times 10^1$) by 76.54 ($.7654 \times 10^2$). You would round $.3142 \times .7654 = .24048868$ to $.2405$ to get a result of 240.5 ($.2405 \times 10^3$). Actually, nearly all computers these days use a base-2 system rather than base 10, but the idea is the same. The number of digits of accuracy has to do with relative error rather than absolute error: the absolute error is the magnitude of the difference between the true value and the approximation, while the relative error is the absolute error divided by the magnitude of the true value.
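
If you want to experiment with this, here is a minimal sketch in Python (just a convenient choice of language; the helper round_sig is made up for illustration, not a library function) of such a 4-digit base-10 system:

\begin{verbatim}
import math

def round_sig(x, n=4):
    # Round x to n significant digits, mimicking a base-10
    # floating-point system that stores only n digits of r.
    if x == 0.0:
        return 0.0
    k = math.floor(math.log10(abs(x))) + 1  # exponent k, with 0.1 <= r < 1
    return round(x, n - k)

p = round_sig(math.pi)    # stored as .3142 x 10^1
q = round_sig(76.54)      # stored as .7654 x 10^2
print(round_sig(p * q))   # .3142 x .7654 = .24048868, rounded: 240.5
\end{verbatim}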

Multiplication and division do not cause too many problems of accuracy, because the product or quotient of two floating-point numbers will have almost as many digits of accuracy as one of those numbers. For addition and subtraction, on the other hand, there are two important problems to note. The first is that when two numbers of very different magnitudes are added, many of the digits of the smaller one will have no effect on the result. In fact, if x is sufficiently close to 0, the computer will not be able to distinguish between 1 + x and 1. This provides a convenient way to gauge the accuracy of floating-point arithmetic in a computer: the ``machine accuracy'' $\epsilon_m$ is defined as the smallest positive x such that 1 + x is distinguishable from 1. It also means that calculating a sum of very many, very small quantities could lead to very inaccurate results. Unfortunately, this is exactly what we want to do in solving differential equations numerically. More about this later.
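
You can estimate $\epsilon_m$ yourself with a little loop that keeps halving a trial value until $1 + x$ is no longer distinguishable from 1. A sketch in Python (whose floats are double precision):

\begin{verbatim}
eps = 1.0
while 1.0 + eps != 1.0:
    eps /= 2.0
# The loop exits at the first power of 2 for which 1 + eps == 1,
# so the machine accuracy lies just above this value.
print(eps)   # about 1.11e-16 in double precision
\end{verbatim}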

The second problem is that subtraction of two nearly equal quantities will result in a large relative error. For example, subtracting 4.321 from 4.322 (both numbers with four significant digits) produces 0.001 with only one significant digit. One place where this can cause unexpected problems is in the standard formula for solving a quadratic equation:

\begin{displaymath}
r_\pm = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{displaymath}

Suppose $b$ is positive, and large compared to $a$ and $c$. Then $\sqrt{b^2 - 4ac}$ will be nearly $b$. Although it's fine for $r_-$, the formula can produce a very inaccurate $r_+$. Try it yourself on your calculator, say with $a = c = 1$ and $b = 10^4,\ 10^6,\ 10^8$ (the correct results would be approximately -1.00000001E-4, -1.0000000E-6, and -1.0000000E-8 respectively). A better way would be to use

\begin{displaymath}
r_+ = \frac{c}{a\,r_-}
\end{displaymath}

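Here is a Python sketch of the comparison, using the values suggested above ($a = c = 1$, $b = 10^8$):

\begin{verbatim}
import math

a, b, c = 1.0, 1e8, 1.0
s = math.sqrt(b*b - 4.0*a*c)      # nearly equal to b

r_minus = (-b - s) / (2.0*a)      # no cancellation: accurate
r_plus_bad = (-b + s) / (2.0*a)   # cancellation: few correct digits
r_plus_good = c / (a * r_minus)   # uses the fact that r_+ r_- = c/a

print(r_plus_bad)    # about -7.45e-9: badly wrong
print(r_plus_good)   # close to -1e-08: essentially full accuracy
\end{verbatim}

The second formula works because the product of the roots is $c/a$, so dividing by the accurately computed $r_-$ avoids the subtraction of nearly equal quantities entirely.
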
In most cases, neither the user nor the programmer has much control over rounding error, since floating-point arithmetic is built into the machine or the programming language. The best we can do is to specify ``double precision'' when that is available. For example, in Turbo Pascal on a PC, any serious numerical work requires at least the ``Double'' type of real, with $\epsilon_m \approx 1.11 \times 10^{-16}$. This is what was used in the programs MG and DIFF. It is rather better than the average pocket calculator, which might have $\epsilon_m \approx 10^{-12}$. On the other hand, even very powerful mainframe computers often have single precision real arithmetic with $\epsilon_m > 10^{-8}$, which would be quite unacceptable.


Discretization error


While there is not much programmers can do about the previous types of error, discretization error is entirely under their control, and most of the subject of numerical analysis deals with ways to control it. Usually, this kind of error results from using an approximate formula. The formula might be valid in the limit as some parameter $h \to 0$, but we have to use it with finite $h$. For example, you may recall that

\begin{displaymath}
\exp(x) = \lim_{h \to 0} (1 + x h)^{1/h}
\end{displaymath}

This might suggest using $(1 + x h)^{1/h}$ as an approximation to $\exp(x)$, with $h$ some small positive number (perhaps the reciprocal of an integer). The difference $\exp(x) - (1 + x h)^{1/h}$ would be the discretization error.
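
A short Python sketch shows how this error behaves as $h$ shrinks. Note that for very small $h$ the round-off problem mentioned earlier (adding the tiny quantity $xh$ to 1) takes over, so the total error eventually gets worse again:

\begin{verbatim}
import math

x = 1.0
for n in range(1, 16):
    h = 10.0 ** (-n)
    approx = (1.0 + x*h) ** (1.0/h)
    # |exp(x) - (1 + x h)^(1/h)| is the discretization error; it
    # shrinks roughly in proportion to h, until round-off in
    # forming 1 + x h starts to dominate (around h = 1e-8).
    print(h, abs(math.exp(x) - approx))
\end{verbatim}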


Robert Israel
9/18/2000