What every computer scientist should know about floating-point arithmetic
Abstract
Floating-point arithmetic is considered an esoteric subject by many people. This is rather surprising, because floating-point is ubiquitous in computer systems: almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow. This paper presents a tutorial on the aspects of floating-point that have a direct impact on designers of computer systems. It begins with background on floating-point representation and rounding error, continues with a discussion of the IEEE floating-point standard, and concludes with examples of how computer system builders can better support floating-point.
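As a small illustration of the two phenomena the abstract names, rounding error and overflow, the following sketch may help. It is not taken from the paper; the function names are hypothetical, and it assumes Python's `float` is an IEEE 754 double, which holds on essentially all current platforms.

```python
import math
import sys


def rounding_error_demo():
    """Return (exact_equal, approximately_equal) for 0.1 + 0.2 versus 0.3.

    Neither 0.1 nor 0.2 is exactly representable in binary, so their
    rounded sum differs slightly from the rounded value of 0.3.
    """
    total = 0.1 + 0.2
    return total == 0.3, math.isclose(total, 0.3)


def overflow_demo():
    """Multiply the largest finite double by 2.

    Assuming IEEE 754 doubles, the product exceeds the representable
    range and the result is positive infinity.
    """
    return sys.float_info.max * 2.0


if __name__ == "__main__":
    print(rounding_error_demo())
    print(overflow_demo())
```

The exact comparison fails while the tolerance-based comparison succeeds, which is why code that compares computed floating-point values usually does so within a tolerance rather than with `==`.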