What every computer scientist should know about floating-point arithmetic

ACM Computing Surveys - Volume 23, Issue 1 - Pages 5-48 - 1991
David Theo Goldberg¹
¹ Xerox Palo Alto Research Center, Palo Alto, CA

Abstract

Floating-point arithmetic is considered an esoteric subject by many people. This is rather surprising, because floating-point is ubiquitous in computer systems: almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow. This paper presents a tutorial on the aspects of floating-point that have a direct impact on designers of computer systems. It begins with background on floating-point representation and rounding error, continues with a discussion of the IEEE floating-point standard, and concludes with examples of how computer system builders can better support floating point.


References

AHO A. V., 1986, Compilers: Principles, Techniques and Tools

ANSI 1978. American National Standard Programming Language FORTRAN ANSI Standard X3.9-1978. American National Standards Institute New York.

BARNETT D. 1987. A portable floating-point environment. Unpublished manuscript.

10.1145/355972.355975

CARDELLI L. DONAHUE J. GLASSMAN L. JORDAN M. KASLOW B. AND NELSON G. 1989. Modula-3 Report (revised). Digital Systems Research Center Report #52 Palo Alto Calif.

CODY W. J., 1984, A proposed radix- and word-length-independent standard for floating-point arithmetic, IEEE Micro, 4, 86, 10.1109/MM.1984.291224

CODY W. J., Reliability in Computing: The Role of Interval Methods in Scientific Computing, Ramon E, 99

COONEN J. 1984. Contributions to a proposed standard for binary floating-point arithmetic. PhD dissertation Univ. of California Berkeley.

DEKKER T. J., 1971, A floating-point technique for extending the available precision, Numer. Math., 18, 224, 10.1007/BF01397083

DEMMEL J., 1984, Underflow and the reliability of numerical software, SIAM J. Sci. Stat. Comput., 5, 887, 10.1137/0905062

10.1002/spe.4380180709

FORSYTHE G. E. AND MOLER C. B. 1967. Computer Solution of Linear Algebraic Systems. Prentice-Hall Englewood Cliffs N.J.

10.1145/363067.363112

GOLDBERG D., Computer Architecture: A Quantitative Approach, David Patterson and John L

GOLUB G. H. AND VAN LOAN C. F. 1989. Matrix Computations. The Johns Hopkins University Press Baltimore MD.

HEWLETT PACKARD, 1982

IEEE, 1987, IEEE Standard 754-1985 for Binary Floating-Point Arithmetic, IEEE. Reprinted in SIGPLAN, 22, 9

KAHAN W., 1972, A Survey of Error Analysis, Information Processing 71, (Ljubljana, Yugoslavia), North Holland, Amsterdam, 2, 1214

KAHAN W. 1986. Calculating Area and Angle of a Needle-like Triangle. Unpublished manuscript.

KAHAN W., The State of the Art in Numerical Analysis

KAHAN W. 1988. Unpublished lectures given at Sun Microsystems Mountain View Calif.

KAHAN W., The Relationship between Numerical Computation and Programming Languages, 103

KAHAN W., 1985, Proceedings of the 7th IEEE Symposium on Computer Arithmetic, 322

KERNIGHAN B. W. AND RITCHIE D. M. 1978. The C Programming Language. Prentice-Hall Englewood Cliffs N.J.

KIRCHNER R. AND KULISCH U., 1987, Arithmetic for vector processors. In Proceedings of the 8th IEEE Symposium on Computer Arithmetic, 256

KNUTH D. E., 1981, Mass.

10.1137/1028001

MATULA D. W., 1985, Finite Precision Rational Arithmetic: Slash Number Systems, IEEE Trans. Comput. C-34, 1, 3, 10.1109/TC.1985.1676511

REISER J. F., 1975, Evading the drift in floating-point addition, Inf. Process. Lett., 3, 84, 10.1016/0020-0190(75)90022-8

STERBETZ P. H. 1974. Floating-Point Computation. Prentice-Hall Englewood Cliffs N.J.

SWARTZLANDER E. E., 1975, The sign/logarithm number system, IEEE Trans. Comput. C-24, 12, 1238, 10.1109/T-C.1975.224172

WALTHER J. S., 1971, Proceedings of the AFIPS Spring Joint Computer Conference, 379