Optimized HPL for AMD GPU and multi-core CPU usage
Abstract
Keywords
References
Advanced Micro Devices: AMD stream computing guide. URL http://developer.amd.com/gpu/ATIStreamSDK/assets/ATI_Stream_SDK_OpenCL_Programming_Guide.pdf
Amdahl G (1967) Validity of the single processor approach to achieving large-scale computing capabilities. In: AFIPS conference proceedings, vol 30, pp 483–485
Drepper U (2007) What every programmer should know about memory. URL http://www.akkadia.org/drepper/cpumemory.pdf
Goethe University of Frankfurt Center for Scientific Computing: LOEWE-CSC cluster. URL http://csc.uni-frankfurt.de/csc/?51
Intel Corporation (2009) Intel Threading Building Blocks reference manual. URL http://software.intel.com/sites/products/documentation/hpc/tbb/reference.pdf
Nakasato N (2010) A fast GEMM implementation on a Cypress GPU. URL http://www.dcs.warwick.ac.uk/~sdh/pmbs10/pmbs10/Workshop_Programme_files/fastgemm.pdf
NVIDIA Corporation: CUBLAS library. URL http://developer.download.nvidia.com/compute/cuda/1_0/CUBLAS_Library_1.0.pdf
Rohr D, Kretz M, Bach M (2010) Technical report, CALDGEMM and HPL. URL http://code.compeng.uni-frankfurt.de/attachments/10/techreport.pdf
Texas Advanced Computing Center: GotoBLAS basic linear algebra library. URL http://www.tacc.utexas.edu/tacc-projects/
University of Tennessee: High Performance Linpack algorithm. URL http://www.netlib.org/benchmark/hpl/algorithm.html
Volkov V, Demmel J (2008) Benchmarking GPUs to tune dense linear algebra. In: SC 08 ACM/IEEE conference on supercomputing proceedings, pp 1–11