Research and Advances

On accurate floating-point summation

The accumulation of floating-point sums is considered on a computer which performs $t$-digit base $\beta$ floating-point addition with exponents in the range $-m$ to $M$. An algorithm is given for accurately summing $n$ $t$-digit floating-point numbers. Each of these $n$ numbers is split into $q$ parts, forming $q \cdot n$ $t$-digit floating-point numbers. Each of these is then added to the appropriate one of $\eta$ auxiliary $t$-digit accumulators. Finally, the accumulators are added together to yield the computed sum. In all, $q \cdot n + \eta - 1$ $t$-digit floating-point additions are performed. Let $\nu = \lceil (M + m + 1)/(\eta + 1) \rceil$. If $n \le (1/q)\,\beta^{\lfloor ((q-1)/q)\,t \rfloor - \nu + 1}$ (*), then the relative error in the computed sum is at most $\lceil (t + 1)/\nu \rceil\,\beta^{1-t}$. Further, with an additional $q + \eta - 1$ $t$-digit additions, the computed sum can be corrected to full $t$-digit accuracy. For example, for the IBM/360 ($\beta = 16$, $t = 14$, $M = 63$, $m = 64$), typical values for $q$ and $\eta$ are $q = 2$ and $\eta = 32$. In this case, (*) becomes $n \le \tfrac{1}{2} \times 16^{4} = 32{,}768$, and we have $\lceil (t + 1)/\nu \rceil\,\beta^{1-t} = 4 \times 16^{-13}$.
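The quantities in the abstract, and the split-then-accumulate idea, can be illustrated with a short Python sketch. It is not the paper's implementation, and it rests on several assumptions: IEEE binary64 doubles ($\beta = 2$, $t = 53$) stand in for the paper's machine, $q$ is fixed at 2, a dictionary keyed by exponent band replaces the fixed set of $\eta$ accumulators, the accumulators are added from the largest band to the smallest, and the final correction step is omitted. The inner bracket in (*) is read as a floor. The names `malcolm_parameters`, `split_in_two`, and `banded_sum` are hypothetical.

```python
import math
from fractions import Fraction


def malcolm_parameters(beta, t, M, m, q, eta):
    """Evaluate nu, the bound (*) on n, and the relative error bound.

    Assumption: the inner bracket in (*) is a floor.
    """
    nu = -(-(M + m + 1) // (eta + 1))           # ceil((M + m + 1) / (eta + 1))
    spare = ((q - 1) * t) // q                  # floor(((q - 1) / q) * t)
    n_max = Fraction(beta) ** (spare - nu + 1) / q
    err = -(-(t + 1) // nu) * Fraction(beta) ** (1 - t)
    return nu, n_max, err


HIGH_BITS = 26  # leading bits kept in the high part (q = 2 split of t = 53)


def split_in_two(x):
    """Split a finite double so that x == hi + lo exactly.

    hi keeps the leading 26 significant bits; lo holds the remaining
    (at most 27) bits.
    """
    mant, exp = math.frexp(x)                   # x = mant * 2**exp, 0.5 <= |mant| < 1
    lead = math.trunc(math.ldexp(mant, HIGH_BITS))
    hi = math.ldexp(lead, exp - HIGH_BITS)
    return hi, x - hi


def banded_sum(values, nu=8):
    """Sum finite doubles using one accumulator per band of nu exponents.

    Each accumulator only sees short parts of similar magnitude, so its
    additions stay exact while the number of inputs is at most
    2**(26 - nu). Only the final pass over the accumulators rounds.
    """
    acc = {}
    for x in values:
        for part in split_in_two(x):
            if part != 0.0:
                band = math.frexp(part)[1] // nu
                acc[band] = acc.get(band, 0.0) + part
    total = 0.0
    for band in sorted(acc, reverse=True):      # assumed order: largest band first
        total += acc[band]
    return total
```

With the IBM/360 values from the abstract, `malcolm_parameters(16, 14, 63, 64, 2, 32)` gives $\nu = 4$, the bound $n \le 32{,}768$, and the error bound $4 \times 16^{-13}$. In the binary64 sketch, `banded_sum([1e16, 1.0, -1e16])` returns `1.0` because the two large terms cancel exactly inside their own accumulators before the small term is added.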

View this article in the ACM Digital Library.
