Multigrain parallelism for eigenvalue computations on networks of clusters

J.R. McCombs1, A. Stathopoulos1
1Department of Computer Science, College of William and Mary, Williamsburg, VA, USA

Tóm tắt

Clusters of workstations have become a cost-effective means of performing scientific computations. However, large network latencies, resource sharing, and heterogeneity found in networks of clusters and Grids can impede the performance of applications not specifically tailored for use in such environments. A typical example is the traditional fine grain implementations of Krylov-like iterative methods, a central component in many scientific applications. To exploit the potential of these environments, advances in networking technology must be complemented by advances in parallel algorithmic design. In this paper, we present an algorithmic technique that increases the granularity of parallel block iterative methods by inducing additional work during the preconditioning (inexact solution) phase of the iteration. During this phase, each vector in the block is preconditioned by a different subgroup of processors, yielding a much coarser granularity. The rest of the method comprises a small portion of the total time and is still implemented in fine grain. We call this combination of fine and coarse grain parallelism multigrain. We apply this idea to the block Jacobi-Davidson eigensolver, and present experimental data that shows the significant reduction of latency effects on networks of clusters of roughly equal capacity and size. We conclude with a discussion on how multigrain can be applied dynamically based on runtime network performance monitoring.

Từ khóa

#Parallel processing #Eigenvalues and eigenfunctions #Computer networks #Concurrent computing #Delay #Iterative methods #Iterative algorithms #Workstations #Resource management #Impedance

Tài liệu tham khảo

karypis, 1998, A parallel algorithm for multilevel graph partitioning and sparse matrix ordering, Journal of Parallel and Distributed Computing, 48, 71, 10.1006/jpdc.1997.1403 10.1109/HPDC.1998.709972 mccombs, 2002, Dynamic load balancing of an iterative eigensolver on Grids of heterogeneous clusters 10.1016/0167-8191(87)90013-5 10.1137/1.9781611971163 10.1137/0914028 saad, 1996, Iterative methods for sparse linear systems saad, 1994, Parallel SPARSe matrix LIBrary (P_SPARSLIB): the iterative solvers module, Technical Report 94-008 Army High Performance Computing Research Center 10.1007/BF01731936 10.1137/S0895479894270427 10.1137/1.9780898719611 concus, 1976, A generalized conjugate gradient method for the numeric solution of elliptic partial differential equations foster, 1998, The Grid-Blueprint for a New Computing Infrastructure 10.1177/109434209701100205 hwang, 1998, Scalable Parallel Computing 10.1145/242857.242867 chow, 2001, ParaSails: Parallel sparse approximate inverse (least-squares) preconditioner, Center for Applied Scientific Computing Lawrence Livermore National Laboratory 10.1016/S0167-8191(01)00077-1 karypis, 1995, Metis: unstructured graph partitioning and sparse matrix ordering system, Technical Report Department of Computer Science University of Minnesota 10.1145/169627.169685 10.1016/S0167-739X(99)00025-4 stathopoulos, 1999, A parallel, block, Jacobi-Davidson implementation for solving large eigen-problems on coarse grain environments, 1999 International Conference on Parallel and Distributed Processing Techniques and Applications, 2920