Parallelizing dense and banded linear algebra libraries using SMPSs
Abstract
Keywords
References
Anderson E, 1992, LAPACK Users' Guide
Chan E, Quintana‐Ortí ES, Quintana‐Ortí G, van de Geijn R. Super matrix out‐of‐order scheduling of matrix operations for SMP and multi‐core architectures. Proceedings of the Nineteenth ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2007), San Diego, CA, U.S.A., 9–11 June 2007; 116–125.
Chan E, Van Zee FG, Quintana‐Ortí ES, Quintana‐Ortí G, van de Geijn R. Satisfying your dependencies with super matrix. Proceedings of IEEE Cluster Computing 2007, September 2007; 91–99.
Quintana‐Ortí G, Quintana‐Ortí ES, Chan E, van de Geijn R, Van Zee FG. Design of scalable dense linear algebra libraries for multithreaded architectures: the LU factorization. Workshop on Multithreaded Architectures and Applications (MTAAP 2008), 2008. CD‐ROM.
Quintana‐Ortí G, Quintana‐Ortí ES, Chan E, Van Zee FG, van de Geijn RA. Scheduling of QR factorization algorithms on SMP and multi‐core architectures. In 16th Euromicro International Conference on Parallel, Distributed and Network‐based Processing (PDP 2008), El Baz FSD, Bourgeois J (eds.). 2008; 301–310.
Chan E, Van Zee FG, Bientinesi P, Quintana‐Ortí ES, Quintana‐Ortí G, van de Geijn R. Super matrix: A multithreaded runtime scheduling system for algorithms‐by‐blocks. ACM SIGPLAN 2008 Symposium on Principles and Practices of Parallel Programming (PPoPP'08), 2008; 123–132.
Quintana‐Ortí G, Quintana‐Ortí ES, Remón A, van de Geijn R. Supermatrix for the factorization of band matrices. FLAME Working Note #27, TR‐07‐51, The University of Texas at Austin, Department of Computer Sciences, September 2007.
Pérez JM, 2007, CellSs programming the Cell/B.E. made easier, IBM Journal of Research and Development, 51
Pérez JM, Badia RM, Labarta J. A flexible and portable programming model for SMP and multi‐cores. Technical Report 03/2007, Barcelona Supercomputing Center - Centro Nacional de Supercomputación, Barcelona, Spain, 2007.
Pérez JM, Badia RM, Labarta J. A dependency‐aware task‐based programming environment for multi‐core architectures. Proceedings of the 2008 IEEE International Conference on Cluster Computing, Causal Productions (ed.). September 2008; 142–151. IEEE Catalog Number CFP08235‐CDR.
Golub GH, 1996, Matrix Computations
Strazdins P. A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization. Technical Report TR‐CS‐98‐07, Department of Computer Science, The Australian National University, Canberra 0200 ACT, Australia, 1998.
Joffrain T, 2005, Proceedings of PARA 2004, 413
Gustavson FG, 2000, The Architecture of Scientific Software, 211
Herrero JR. A framework for efficient execution of matrix computations. PhD Thesis, Polytechnic University of Catalonia, Spain, 2006.
Low TM, van de Geijn R. An API for manipulating matrices stored by blocks. Technical Report TR‐2004‐15, Department of Computer Sciences, The University of Texas at Austin, May 2004.