DAGuE: A generic distributed DAG engine for High Performance Computing

Parallel Computing - Volume 38 - Pages 37-51 - 2012

George Bosilca¹, Aurelien Bouteiller¹, Anthony Danalis¹, Thomas Herault¹, Pierre Lemarinier², Jack Dongarra¹,³

¹Innovative Computing Laboratory, The University of Tennessee, United States

²IRISA, Université de Rennes 1, France

³Oak Ridge National Laboratory, United States

