High-performance parallel implicit CFD

Parallel Computing - Volume 27, Issue 4 - Pages 337-362 - 2001
William Gropp1, D. Kaushik1, David E. Keyes2, Barry Smith3
1Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
2Old Dominion University
3Argonne National Laboratory

References

Anderson, 1994, An implicit upwind algorithm for computing turbulent flows on unstructured grids, Comput. Fluids, 23, 1, 10.1016/0045-7930(94)90023-X

Anderson, 1996, Implicit/multigrid algorithms for incompressible turbulent flows on unstructured grids, J. Comput. Phys., 128, 391, 10.1006/jcph.1996.0219

S. Balay, W.D. Gropp, L.C. McInnes, B.F. Smith, The Portable Extensible Toolkit for Scientific Computing (PETSc) version 28, http://www.mcs.anl.gov/petsc/petsc.html, 2000

Bova, 2000, Dual-level parallel analysis of harbor wave response using MPI and OpenMP, Int. J. High Performance Comput. Appl., 14, 49, 10.1177/109434200001400104

X.-C. Cai, Some domain decomposition algorithms for nonselfadjoint elliptic and parabolic partial differential equations, Technical Report 461, Courant Institute, New York, 1989

Cai, 1999, A restricted additive Schwarz preconditioner for general sparse linear systems, SIAM J. Scientific Comput., 21, 792, 10.1137/S106482759732678X

E. Cuthill, J. McKee, Reducing the bandwidth of sparse symmetric matrices, in: Proceedings of the 24th National Conference of the ACM, 1969

Dembo, 1982, Inexact Newton methods, SIAM J. Numer. Anal., 19, 400, 10.1137/0719025

J. Dongarra, H.-W. Meuer, E. Strohmaier, The TOP 500 List, http://www.netlib.org/benchmark/top500.html, 2000

W.D. Gropp, D.K. Kaushik, D.E. Keyes, B.F. Smith, Toward realistic performance bounds for implicit CFD codes, in: D. Keyes, A. Ecer, J. Periaux, N. Satofuka, P. Fox (Eds.), Proceedings of the Parallel CFD'99, Elsevier, Berlin, 1999, pp. 233–240

W.D. Gropp, D.K. Kaushik, D.E. Keyes, B.F. Smith, Performance modeling and tuning of an unstructured mesh CFD application, in: Proceedings of the SC2000, IEEE Computer Society, 2000

Gropp, 2000, Globalized Newton–Krylov–Schwarz algorithms and software for parallel implicit CFD, Int. J. High Performance Comput. Appl., 14, 102, 10.1177/109434200001400202

Gropp, 1999

Hennessy, 1996

P.D. Hough, T.G. Kolda, V.J. Torczon, Asynchronous parallel pattern search for nonlinear optimization, Technical Report SAND2000-8213, Sandia National Laboratories, Livermore, January 2000

Karypis, 1999, A fast and high quality scheme for partitioning irregular graphs, SIAM J. Scientific Comput., 20, 359, 10.1137/S1064827595287997

Kelley, 1998, Convergence analysis of pseudo-transient continuation, SIAM J. Numer. Anal., 35, 508, 10.1137/S0036142996304796

D.E. Keyes, How scalable is domain decomposition in practice? in: C.-H. Lai et al. (Eds.), Proceedings of the 11th International Conference on Domain Decomposition Methods, Domain Decomposition Press, Bergen, 1999

D.E. Keyes, Four horizons for enhancing the performance of parallel simulations based on partial differential equations, in: Proceedings of the Europar 2000, Lecture Notes in Computer Science, Springer, Berlin, 2000

D.J. Mavriplis, Parallel unstructured mesh analysis of high-lift configurations, Technical Report 2000-0923, AIAA, 2000

J.D. McCalpin, STREAM: Sustainable memory bandwidth in high performance computers, Technical report, University of Virginia, 1995, http://www.cs.virginia.edu/stream

MIPS Technologies, Inc., http://techpubs.sgi.com/library/manuals/2000/007-2490-001/pdf/007-2490-001.pdf. MIPS R10000 Microprocessor User's Manual, January 1997

Mulder, 1985, Experiments with implicit upwind methods for the Euler equations, J. Comput. Phys., 59, 232, 10.1016/0021-9991(85)90144-5

Silicon Graphics, Inc, http://techpubs.sgi.com/library/manuals/3000/007-3430-002/pdf/007-3430-002.pdf. Origin 2000 and Onyx2 Performance and Tuning Optimization Guide, 1998, Document Number 007-3430-002

Smith, 1996

O. Temam, W. Jalby, Characterizing the behavior of sparse algorithms on caches, in: Proceedings of the Supercomputing 92, IEEE Computer Society, 1992, pp. 578–587

Toledo, 1997, Improving the memory-system performance of sparse-matrix vector multiplication, IBM J. Res. Dev., 41, 711, 10.1147/rd.416.0711

Wang, 1999, Performance enhancements on microprocessors with hierarchical memory systems for solving large sparse linear systems, Int. J. High Performance Comput. Appl., 13, 63, 10.1177/109434209901300104

J. White, P. Sadayappan, On improving the performance of sparse matrix–vector multiplication, in: Proceedings of the Fourth International Conference on High Performance Computing (HiPC'97), IEEE Computer Society, 1997, pp. 578–587