Efficient evaluation methods of elementary functions suitable for SIMD computation

Naoki Shibata¹
¹Department of Information Processing & Management, Shiga University, Hikone, Japan
