Science and engineering depend upon computation of functions such as flow fields, charge distributions, and quantum states. Ultimately, such computations require some kind of discretization, but in recent years, it has become possible in many cases to hide the discretizations from the user. We present the Chebfun system for numerical computation with functions, which is based on a key idea: an analogue of floating-point arithmetic for functions rather than numbers.
1. Introduction
The oldest problem of computing is, how can we calculate mathematical quantities? As other aspects of computing have entered into every corner of our lives, mathematical computation has become a less conspicuous part of computer science, but it has not gone away. On the contrary, it is bigger than ever, the basis of much of science and engineering.
The mathematical objects of interest in science and engineering are not just individual numbers but functions. To make weather predictions, we simulate velocity, pressure, and temperature distributions, which are multidimensional functions evolving in time. To design electronic devices, we compute electric and magnetic fields, which are also functions. Sometimes the physics of a problem is described by long-established differential equations such as the Maxwell or Schrödinger equations, but just because the equations are understood does not mean the problem is finished. It may still be a great challenge to solve the equations.
How do we calculate functions? The almost unavoidable answer is that they must be discretized in one way or another, so that derivatives, for example, may be replaced by finite differences. Numerical analysts and computational engineers are the experts at handling these discretizations.
As computers grow more powerful, however, a new possibility has come into play: hiding the discretizations away so that the scientist does not have to see them. This is not feasible yet for weather prediction, but for certain kinds of desktop computing, it is becoming a reality. This paper introduces the Chebfun software system, which has followed this vision from its inception in 2002. For functions of one variable, f(x), the aim has been largely achieved, and progress is well underway for functions of two variables, f(x, y).
Chebfun is built on an analogy. To work with real numbers on a computer, we typically approximate them to 16 digits by finite bit strings: floating-point numbers, with an associated concept of rounding at each step of a calculation. To work with functions, Chebfun approximates them to 16 digits by polynomials (or piecewise polynomials) of finite degree: Chebyshev expansions, again with an associated concept of rounding. Thus the key to numerical computation with functions is the generalization of the ideas of floating-point approximation and rounding from numbers to functions.
2. A Combinatorial Explosion
Have not discretizations in general, and floating-point numbers in particular, been rendered superfluous by the introduction of symbolic systems like Mathematica or Maple? It is worth taking a moment to explain why the answer is no, for this will help elucidate the basis of our algorithms for numerical computing with functions.
We begin with what looks like an encouraging observation: if x and y are rational numbers, then so are x + y, x − y, xy, and x/y (assuming y ≠ 0). Since rational numbers can readily be represented on computers, this might seem to suggest that there is no need for floating-point arithmetic with its inexact process of rounding. If a computer works in rational arithmetic, no error is ever made, so it might seem that, in principle, much of numerical computation could be carried out exactly.
The first obstacle we encounter is that not every interesting real number x is rational (think of √2, the hypotenuse of a right triangle with unit legs). However, this alone is not a serious problem, as x can be approximated arbitrarily closely by rationals.
The bigger problem is that when we try to construct such approximations by practical algorithms, we run into combinatorial or exponential explosions. For example, suppose we wish to find a root of the polynomial
We can approximate an answer to great accuracy by rational numbers if we take a few steps of Newton’s method, taught in any introductory numerical analysis course. Let us do this, beginning from the initial guess x(0) = 0. The startling result is shown in Table 1.
There is a problem here! As approximations to an exact root of p, the rational numbers displayed in the table are accurate to approximately 0, 0, 1, 3, 6, and 12 digits, respectively; the number of useful digits doubles at each step thanks to the quadratic convergence of Newton’s method. Yet the lengths of the numerators are 1, 1, 2, 10, 53, and 265 digits, expanding by a factor of about 5 at each step since the degree of p is 5. After three more steps, we will have an answer x(8) accurate to 100 digits, but represented by numerator and denominator each about 33,125 digits long, and storing it will require 66 kB. If we were so foolish as to try to take 20 steps of Newton’s method in this mode, we would need 16 TB to store the result.
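Here is a sketch of this experiment using MATLAB's Symbolic Math Toolbox. The quintic is a stand-in, since the article's polynomial is not reproduced in this text; any degree-5 example exhibits the same fivefold growth in digit counts:

    % Newton's method in exact rational arithmetic (Symbolic Math Toolbox).
    % The quintic below is an assumed example, not the article's polynomial.
    p  = @(x) x^5 - 2*x - 1;            % stand-in polynomial of degree 5
    dp = @(x) 5*x^4 - 2;                % its derivative
    x = sym(0);                         % exact rational initial guess x(0) = 0
    for k = 1:6
        x = x - p(x)/dp(x);             % one exact Newton step
        [num, den] = numden(x);         % numerator and denominator of the iterate
        fprintf('step %2d: numerator has %d digits\n', ...
                k, strlength(char(abs(num))));
    end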
Such difficulties are ubiquitous. Rational computations, and symbolic computations in general, have a way of expanding exponentially. If nothing is done to counter this effect, computations grind to a halt because of excessive demands on computing time and memory. This is ultimately the reason why symbolic computing, though powerful when it works, plays such a circumscribed role in computational science. As an example with more of a flavor of functions rather than numbers, suppose we want to know the indefinite integral of the function
This happens to be a function that can be integrated analytically, but the result is not simple. The Wolfram Mathematica Online Integrator produces an answer that consists of the expression
plus 20 other terms of similar form, with denominators ranging from 512 to 3,687,424. Working with such expressions is unwieldy when it is possible at all. An indication of their curious status is that if I wanted to be confident that this long formula was right, the first thing I would do would be to see if it matched results from a numerical computation.
3. Floating-Point Arithmetic
It is in the light of such examples that I would like to consider the standard alternative to rational arithmetic, namely floating-point arithmetic. As is well known, this is the idea of representing numbers on computers by, for example, 64-bit binary words containing 53 bits (≈16 digits) for a fraction and 11 for an exponent. (These parameters correspond to the IEEE double precision standard.) Konrad Zuse invented floating-point arithmetic in Germany before World War II, and the idea was developed by IBM and other manufacturers a few years later. The IEEE standardization came in the mid-1980s and is beautifully summarized in the book by Overton.15 For more up-to-date details, see Muller et al.14
There are two aspects to floating-point technology: a representation of real (and complex) numbers via a subset of the rationals and a prescription for rounded arithmetic. These principles combine to halt the combinatorial explosion. Thus, for example, if two 53-bit numbers are multiplied, the mathematically exact result would require about 106 bits to be represented. Instead of accepting this, we round the result down to 53 bits again. More generally, most floating-point arithmetic systems adhere to the following principle: when an operation +, −, ×, / is performed on two floating-point numbers, the output is the exactly correct result rounded to the nearest floating-point number, with ties broken by a well-defined rule. This implies that every floating-point operation is exact except for a small relative error:

fl(x ∗ y) = (x ∗ y)(1 + ε),   |ε| ≤ εmach.   (1)

Here ∗ denotes one of the operations +, −, ×, /, fl(·) denotes the result computed in floating-point arithmetic, and we are ignoring the possibilities of underflow or overflow. The IEEE double precision value of "machine epsilon" is εmach = 2⁻⁵³ ≈ 1.1 × 10⁻¹⁶.
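A two-line MATLAB illustration of this rounding at work:

    % 0.1, 0.2, and 0.3 have no exact binary representation, and each
    % operation rounds its result, so the computed sum misses 0.3 by
    % an amount on the order of machine epsilon.
    disp(0.1 + 0.2 == 0.3)        % 0 (false)
    disp(abs((0.1 + 0.2) - 0.3))  % about 5.6e-17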
Equation (1) implies an important corollary: when two floating-point numbers x and y are combined on the computer by an operation ∗, the computed result fl(x ∗ y) is exactly equal to x̃ ∗ ỹ for some numbers x̃ and ỹ that are close to x and y in a relative sense:

fl(x ∗ y) = x̃ ∗ ỹ,   x̃ = x(1 + δ1),  ỹ = y(1 + δ2),  |δ1|, |δ2| ≤ εmach.   (2)
Numerical analysts say that the operations +, −, ×, / are backward stable, delivering the exactly correct results for inputs that are slightly perturbed from their correct values in a relative sense. The same conclusion holds or nearly holds for good implementations of other elementary operations, often unary instead of binary, such as √, exp, or sin.14
Floating-point arithmetic is not widely regarded as one of computer science’s sexier topics. A common view is that it is an ugly but necessary engineering compromise. We cannot do arithmetic honestly, the idea goes, so we cheat a bit—unfortunate, but unavoidable, or as some have called it, a “Faustian bargain.” In abandoning exact computation, we sell our souls, and in return, we get some numbers.
I think one can take a more positive view. Floating-point arithmetic is an algorithm, no less: a general procedure for containing the combinatorial explosion. Consider the Newton iteration of Table 1 again, but now carried out in IEEE 16-digit floating-point arithmetic.
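Here is a minimal sketch of such a loop in double precision (again with the stand-in quintic rather than the article's polynomial):

    % The same kind of iteration in IEEE double precision: every iterate
    % is a single 64-bit number, however many steps we take.
    x = 0;
    for k = 1:8
        x = x - (x^5 - 2*x - 1)/(5*x^4 - 2);   % rounded at every operation
        fprintf('step %d: x = %.16f\n', k, x)
    end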
It is the same process as before, less startling without the exponential explosion, but far more useful. Of course, though these numbers are printed in decimal, what is really going on in the computer is binary; the exact value at the end is not the decimal number printed but the nearby 53-bit binary fraction that the machine actually stores.
Abstractly speaking, when we compute with rational numbers, we might proceed like this:
Compute an exact result,
then round it to a certain number of bits.
The problem is that the exact result is often exponentially lengthy. Floating-point arithmetic represents an alternative idea:
Round the computation at every step,
not just at the end.
This strategy has proved spectacularly successful. At a stroke, combinatorial explosion ceases to be an issue. Moreover, so long as the computation is not numerically unstable in a sense understood thoroughly by numerical analysts, the final result will be accurate. This is what one observes in practice, and it is also the rigorous conclusion of theoretical analysis of thousands of algorithms investigated by generations of numerical analysts.12
4. Chebfun
Chebfun is an open-source software system developed over the past decade at Oxford by myself and a succession of students and postdocs including Zachary Battles, Ásgeir Birkisson, Nick Hale, and Alex Townsend, as well as Toby Driscoll at the University of Delaware (a full list can be found in the Acknowledgments and at www.chebfun.org). The aim of Chebfun is to extend the ideas we have just discussed from numbers to functions. Specifically, Chebfun works with piecewise smooth real or complex functions defined on an interval [a, b], which by default is [–1, 1]. A function is represented by an object known as a chebfun. (We write “Chebfun” as the name of the system and “chebfun” for the representation of an individual function.) If f and g are chebfuns, we can perform operations on them such as +, −, ×, /, as well as other operations like exp or sin. The intention is not that such computations will be exact. Instead, the aim is to achieve an analogue of Equation (2) for functions,
fl(f ∗ g) = f̃ ∗ g̃,   ||f̃ − f|| ≤ C εmach ||f||,  ||g̃ − g|| ≤ C εmach ||g||   (3)

(again ignoring underflow and overflow), where C is a small constant, with a similar property for unary operations. Here ||·|| is a suitable norm such as ||·||∞. Thus the aim of Chebfun is normwise backward stable computation of functions. We shall say more about the significance of Equation (3) in Section 6.
Chebfun is implemented in MATLAB, a language whose object-oriented capabilities enable one to overload operations such as +, −, ×, /, sin, and exp with appropriate alternatives. Some of the methods defined for chebfuns are as follows (this list is about one-third of the total):
MATLAB (or Python) programmers will recognize many of these as standard commands. In MATLAB, such commands apply to discrete vectors, or sometimes matrices, but in Chebfun, they perform continuous analogues of the operations on chebfuns. Thus, for example, log(f) and sinh(f) deliver the logarithm and the hyperbolic sine of a chebfun f, respectively. More interestingly, sum(f) produces the definite integral of f from a to b (a scalar), the analogue for continuous functions of the sum of entries of a vector. Similarly, cumsum(f) produces the indefinite integral of f (a chebfun), diff(f) computes the derivative (another chebfun), and roots(f) finds the roots in the interval [a, b] (a vector of length equal to the number of roots).
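For instance, here is a minimal sketch of these continuous analogues, with an arbitrary example function on the default interval [−1, 1]:

    f  = chebfun(@(x) sin(3*x) + x.^2);
    I  = sum(f);       % definite integral from -1 to 1, a scalar
    F  = cumsum(f);    % indefinite integral, another chebfun
    fp = diff(f);      % derivative, another chebfun
    r  = roots(f);     % roots in [-1, 1], a vector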
Mathematically, the basis of Chebfun—and the origin of its name—is piecewise Chebyshev expansions. Let Tj denote the Chebyshev polynomial Tj(x) = cos(j cos⁻¹ x), of degree j, which equioscillates between j + 1 extrema ±1 on [−1, 1]. The Chebyshev series for any Hölder continuous f ∈ C[−1, 1] is defined by22

f(x) = ∑′_{j=0}^{∞} aj Tj(x),   aj = (2/π) ∫_{−1}^{1} f(x) Tj(x) / √(1 − x²) dx,   (4)

where the prime indicates that the term with j = 0 is multiplied by 1/2. (These formulas can be derived using the change of variables x = cos θ from the Fourier series for the 2π-periodic even function f(cos θ). Chebyshev series are essentially the same as Fourier series, but for nonperiodic functions.) Chebfun is based on storing and manipulating coefficients {aj} for such expansions. Many of the algorithms make use of the equivalent information of samples f(xj) at Chebyshev points,

xj = cos(jπ/n),   j = 0, 1, …, n,   (5)

and one can go back and forth to the representation of Equation (4) as needed by means of the Fast Fourier Transform (FFT). Each chebfun has a fixed finite degree n chosen to be large enough for the representation, according to our best estimate, to be accurate in the sense of Equation (3) to 16 digits. Given data fj = f(xj) at the Chebyshev points (Equation (5)), other values can be determined by the barycentric interpolation formula,18

f(x) = ( ∑_{j=0}^{n} wj fj / (x − xj) ) / ( ∑_{j=0}^{n} wj / (x − xj) ),   (6)

where the weights {wj} are defined by

wj = (−1)^j δj,   δj = 1/2 for j = 0 or j = n, and δj = 1 otherwise.

(If x happens to be exactly equal to some xj, one bypasses Equation (6) and sets f(x) = f(xj).) This method is known to be numerically stable, even for polynomial interpolation in millions of points.13
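Here is a sketch of Equations (5) and (6) written out directly, interpolating an arbitrary example function at 21 Chebyshev points:

    n  = 20;
    j  = (0:n)';
    xj = cos(j*pi/n);                          % Chebyshev points, Equation (5)
    fj = exp(xj).*sin(5*xj);                   % samples of an example function
    w  = (-1).^j;  w([1 end]) = w([1 end])/2;  % barycentric weights wj
    x  = 0.3;                                  % an evaluation point (not a node)
    fx = sum(w.*fj./(x - xj)) / sum(w./(x - xj));
    disp(fx - exp(0.3)*sin(1.5))               % tiny interpolation error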
If f is analytic on [−1, 1], its Chebyshev coefficients {aj} decrease exponentially.22 If f is not analytic but still several times differentiable, they decrease at an algebraic rate determined by the number of derivatives. It is these properties of rapid convergence that Chebfun exploits to be a practical computational tool. Suppose a chebfun is to be constructed, for example, by a statement such as

    f = chebfun(@(x) sin(x))

What happens when this command is executed is that the system performs adaptive calculations to determine what degree of polynomial approximation is needed to represent sin(x) to about 15 digits of accuracy. The answer in this case turns out to be 13, so our 15-digit approximation is a polynomial of degree 13, which can be written either in the well-behaved basis of Chebyshev polynomials {Tk} or in the badly behaved but more familiar basis of monomials. This is a rather short chebfun; more typically, the length might be 50 or 200. For example, chebfun(@(x) sin(50*x)) has length 90, and chebfun(@(x) exp(-1./x.^2)) has length 219.
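These lengths are easy to check (a sketch; exact values could differ slightly between Chebfun versions):

    f1 = chebfun(@(x) sin(x));           % degree 13, so length 14
    f2 = chebfun(@(x) sin(50*x));        % length 90
    f3 = chebfun(@(x) exp(-1./x.^2));    % length 219
    disp([length(f1) length(f2) length(f3)])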
Having settled on representing functions by Chebyshev expansions and interpolants, we next face the question of how to implement mathematical operations such as those in the list above. This is a very interesting matter, and details of the many algorithms used in Chebfun can be found in Trefethen22 and the other references. For example, zeros of chebfuns are found by roots through a recursive subdivision of the interval combined with eigenvalue computations for Chebyshev "colleague matrices,"4 and global maxima and minima are determined by max and min by first finding zeros of the derivative. All these computations are fast and accurate even when the underlying polynomial representations have degrees in the thousands.
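For instance, here is a sketch of what min does internally, with an arbitrary example function:

    % Locate the zeros of the derivative, then compare the critical values
    % and the endpoint values.
    f    = chebfun(@(x) x.^2 + sin(8*x));
    crit = roots(diff(f));               % critical points via roots
    cand = [f(-1); f(crit); f(1)];       % candidate extreme values
    disp(min(cand))                      % agrees with min(f)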
At the end of Section 2, we considered an indefinite integral. In Chebfun, indefinite integration is carried out by the command cumsum, as mentioned above, and for that example on the interval [−1, 1], the chebfun g of the indefinite integral is produced in about 0.02 s on a desktop machine, a polynomial of degree 94 accurate to about 16 digits.
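In outline the computation looks like this (the integrand of Section 2 is elided in this text, so an arbitrary smooth integrand stands in for it):

    f = chebfun(@(x) exp(x).*sin(20*x));   % stand-in integrand on [-1, 1]
    g = cumsum(f);                         % its indefinite integral, a chebfun
    plot(g)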
5. Taming the Explosion
As mentioned earlier, when two 53-bit numbers are multiplied, an exact result would normally require 106 bits, but floating-point arithmetic rounds this to 53. Chebfun implements an analogous compression for polynomial approximations of functions as opposed to binary approximations of numbers. For example, suppose x is the chebfun corresponding to the linear function x on [−1, 1]. If we execute a pair of commands constructing chebfuns f and g from x, we may find that f and g have degrees 13 and 14, respectively. One might expect their product h = f.*g to have degree 27, but in fact, h has degree only 17. This happens because at every step, the system automatically discards Chebyshev coefficients that are below machine precision—just as floating-point arithmetic discards bits below the 53rd. The degree grows only as the complexity of the functions involved genuinely grows, as measured on the scale of machine epsilon.
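Here is a sketch consistent with those degrees; sin and cos are assumptions on my part, chosen because they happen to match the quoted numbers:

    x = chebfun(@(x) x);     % the linear function x on [-1, 1]
    f = sin(x);              % degree 13
    g = cos(x);              % degree 14
    h = f.*g;                % equals sin(2x)/2: degree 17, not 27
    disp([length(f) length(g) length(h)] - 1)   % the degrees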
Here is an example to illustrate how this process contains the explosion of polynomial degrees. Consider a program that begins by constructing a chebfun f corresponding to the function sin(πx) on the interval [−1, 1], with degree 19, and then takes 15 steps of an iteration that raises the current f to the fourth power at each step. The result after a fraction of a second on a desktop computer is a rather complicated chebfun s of degree 3378.
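Here is a sketch of a program with this behavior. The particular update formula is an assumed reconstruction rather than the article's actual listing, though it reproduces the quoted degrees (19 for sin(πx), growth by a factor of 4 per step):

    x = chebfun(@(x) x);
    f = sin(pi*x);                % degree 19
    s = f;
    for j = 1:15
        f = (3/4)*(1 - 2*f.^4);   % raise to the fourth power (and rescale)
        s = s + f;                % accumulate the result in s
    end
    length(s) - 1                 % the article reports degree 3378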
The degree 3378 may seem high, but it is very low compared to what it would be if the fourth powers were computed without dropping small coefficients, namely 19 × 4¹⁵ = 20,401,094,656! Thus the complexity has been curtailed by a factor of millions, yet with little loss of accuracy. For example, the command roots(s-8) now takes less than a second to compute the 12 points x ∈ [−1, 1] with s(x) = 8, and these results are all correct except in the last digit.
Once one has a chebfun representation, further computations are easy. For example, sum(s) returns the definite integral 15.26548382582674 in a few thousandths of a second; the exact value is 15.26548382582674700943….
6. Normwise Backward Stability
Does Chebfun live up to the vision of an analogue for functions of floating-point arithmetic for numbers? In considering this question, a good starting point is the normwise backward stability condition of Equation (3), and in particular, it is productive to focus on two questions:

- (I) How close does Chebfun come to achieving Equation (3)?
- (II) What are the implications of this condition?
The answer to (I) appears to be that Chebfun does satisfy Equation (3), at least for the basic operations +, −, ×, /. This has not been proved formally, and it is a project for the future to develop a rigorous theory. To explain how Equation (3) can hold, let us consider the mode in which each chebfun is represented precisely by a finite Chebyshev series with floating-point coefficients (instead of values at Chebyshev points). The property of Equation (3) for + and − stems from the corresponding properties for addition and subtraction of floating-point numbers, together with the numerical stability of barycentric interpolation.13 For multiplication, the argument is only slightly more complicated, since again the operation comes down to manipulation of Chebyshev coefficients. The more challenging fundamental operation is division: in this case, the quotient f/g is sampled pointwise at various Chebyshev points, and a new Chebyshev series is constructed by the adaptive process used generally for chebfun construction. It is not known whether the current code contains enough safeguards to guarantee Equation (3), and this is a subject for investigation. In addition, it will be necessary to consider analogues of Equation (3) for other Chebfun operations besides +, −, ×, /.
This brings us to (II), the question of the implications of Equation (3). The easier part of the answer, at least for numerical analysts familiar with backward error analysis, is to understand exactly what the property of Equation (3) does and does not assert about numerical accuracy. A crucial fact is that the bound involves the global norms of the functions f and g, not their values at particular points. For example, we may note that if two chebfuns f and g give (f − g)(x) < 0 at a point x, then from Equation (3), we cannot conclude that f(x) < g(x). We can conclude, however, that there are nearby chebfuns f̃ and g̃ with f̃(x) < g̃(x). This is related to the "zero problem" that comes up in the theory of real computation.24 It is well known that the problem of determining the sign of a difference of real numbers with guaranteed accuracy poses difficulties. Chebfun makes no claim to overcome these difficulties: the normwise condition of Equation (3) promises less.
Does it promise enough to be useful? For what strings of computations in a system satisfying Equation (3) at each step can we expect satisfactory results? This is nothing less than the problem of stability of Chebfun algorithms, and it is a major topic for future research. Certainly, there may be applications where Equation (3) is not enough to imply what one would like, typically for reasons related to the zero problem. For example, this may happen in some problems of geometry, where arbitrarily small coordinate errors may make the difference between two bodies intersecting or not intersecting, or between convex and concave. On the other hand, generations of numerical analysts have found that such difficulties are by no means universal: the backward stability condition of Equation (2) for floating-point arithmetic is sufficient to ensure success for many scientific computations. An aim of ours for the future will be to determine how far this conclusion carries over to the condition of Equation (3) for chebfuns.
7. Chebfun Software Project
Chebfun began in 2002 as a few hundred lines of MATLAB code, written by Zachary Battles, for computing with global polynomial representations of smooth functions on [–1, 1], and this “core Chebfun” framework has been the setting for the discussion in this article. But in fact, the project has expanded greatly in the decade since then, both as a software effort and in its computational capabilities.
In terms of software, we have grown to an open-source project hosted on GitHub with currently about a dozen developers, most but not all based at Oxford. The code is written in MATLAB, which is a natural choice for this kind of work because of its vector and matrix operations, although implementations of parts of core Chebfun have been produced by various people in other languages including Python, C, Julia, Maxima, and Octave. To date, there have been about 20,000 Chebfun downloads. We interact regularly with users through bug reports, help requests by email, and other communications, but we believe we are not alone among software projects in feeling that we have an inadequate understanding of who our users are and what they are doing.
In terms of capabilities, here are some of the developments beyond the core ideas emphasized in this article. The abbreviations ODE and PDE stand for ordinary and partial differential equations.
- piecewise smooth functions16
- periodic functions (Fourier not Chebyshev)7
- fast edge detection for determining breakpoints16
- infinite intervals [a, ∞), (–∞, b], (–∞, ∞)
- functions with poles and other singularities
- delta functions of arbitrary order
- Padé, Remez, CF rational approximations8, 17, 23
- fast Gauss and Gauss–Jacobi quadrature9, 11
- fast Chebyshev ↔ Legendre conversions10
- continuous QR factorization, SVD, least-squares1, 21
- representation of linear operators6
- solution of linear ODEs6
- solution of integral equations5
- solution of eigenvalue problems6
- exponentials of linear operators6
- Fréchet derivatives via automatic differentiation2
- solution of nonlinear ODEs2
- PDEs in one space variable plus time
- Chebgui interface to ODE/PDE capabilities
- Chebfun2 extension to rectangles in 2D19, 20
We shall not attempt to describe these developments, but here are a few comments. For solving ODE boundary value problems, whether scalar or system, smooth or just piecewise smooth, Chebfun and its interface Chebgui have emerged as the most convenient and flexible tool in existence, making it possible to solve all kinds of problems with minimal effort and with accuracy close to machine precision (these developments are due especially to Ásgeir Birkisson, Toby Driscoll, and Nick Hale).2 For computing quadrature nodes and weights, convolution, and conversion between Legendre and Chebyshev coefficient representations, Chebfun contains codes implementing new algorithms that represent the state of the art, enabling machine accuracy for even millions of points in seconds (these developments are due to Nick Hale, Alex Townsend, and Ignace Bogaert3, 9, 10). Extensions to multiple dimensions have begun with Alex Townsend's Chebfun2 code initially released in 2013.19, 20
The best way to get a sense of the wide range of problems that can be solved by this kind of computing is to look at the collection of Chebfun Examples available online at the web site www.chebfun.org. Approaching 200 in number, the examples are organized under headings that look like chapters of a numerical analysis textbook (optimization, quadrature, linear algebra, geometry, …), with dozens of short discussions in each category of problems ranging from elementary to advanced.
Here is an example that gives a taste of Chebfun's ability to work with functions that are only piecewise smooth and to solve ODE eigenvalue problems. A short sequence of Chebfun commands produces the plot shown in Figure 1 as well as associated numerical output. The figure shows the first 10 eigenmodes of a Schrödinger operator −h²∂²u/∂x² + V(x)u(x) with the default value of Planck's constant h = 0.1. The potential function V(x) consists of the parabola x²/2 over the interval [−2, 2] maximized with a triangular barrier around x = 0, and it is represented by a piecewise-smooth chebfun with four pieces. This kind of mathematics arises in any introductory quantum mechanics course; Chebfun makes exploring the dependence of eigenstates on potential functions almost effortless, yet with accuracy close to machine precision.
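A hedged sketch of such a sequence (the exact commands and the barrier's shape are assumptions consistent with the description above):

    x = chebfun('x', [-2 2]);
    V = max(x.^2/2, 1 - 2*abs(x));   % parabola maximized with a triangular barrier
    quantumstates(V)                 % first eigenstates, default h = 0.1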
And here is an example that gives a taste of Chebfun-like computing on rectangles in 2D as implemented by Townsend's extension Chebfun2. A two-line sequence defines and plots a chebfun2 representing an oscillatory function of x and y on the unit square [−1, 1]², as shown in Figure 2, and the command max2 then returns its global maximum in a fraction of a second.
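A sketch of such a sequence, with an assumed stand-in for the elided oscillatory function:

    f = chebfun2(@(x,y) cos(10*x.*y) + y/2);   % on the default square [-1,1]^2
    plot(f)
    m = max2(f)                                % global maximum over the square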
The algorithms underlying Chebfun2 are described in Townsend and Trefethen.19, 20
8. Conclusion
Chebfun is being used by scientists and engineers around the world to solve one-dimensional and two-dimensional numerical problems without having to think about the underlying discretizations. The Chebyshev technology it is built on is powerful, and it is hard to see any serious competition for this kind of high-accuracy representation of functions in 1D.
At the same time, the deeper point of this article has been to put forward a vision that is not tied specifically to Chebyshev expansions or to other details of Chebfun. The vision is that by the use of adaptive high-accuracy numerical approximations of functions, computational systems can be built that “feel symbolic but run at the speed of numerics.”
Acknowledgments
In addition to the leaders mentioned at the beginning of Section 4, other contributors to the Chebfun project have included: Anthony Austin, Folkmar Bornemann, Filomena di Tommaso, Pedro Gonnet, Stefan Güttel, Hrothgar, Mohsin Javed, Georges Klein, Hadrien Montanelli, Sheehan Olver, Ricardo Pachón, Rodrigo Platte, Mark Richardson, Joris Van Deun, Grady Wright, and Kuan Xu. It has been a fascinating experience working with these people over the past decade to rethink so much of discrete numerical mathematics in a continuous mode.
During 2008–2011, the Chebfun project was supported by the UK Engineering and Physical Sciences Council. Currently, we are supported by MathWorks, Inc. and by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007–2013)/ERC grant agreement no. 291068. The views expressed in this article are not those of the ERC or the European Commission, and the European Union is not liable for any use that may be made of the information contained here.
Figures
Figure 1. Schrödinger eigenstates computed by quantumstates(V), where V is a chebfun representing a piecewise smooth potential function.
Figure 2. Two-dimensional extension of Chebfun: an oscillatory function represented by a chebfun2, with its maximum shown as a black dot.