GPU-acceleration of stiffness matrix calculation and efficient initialization of EFG meshless methods

A. Karatarakis¹, P. Metsis¹, M. Papadrakakis¹
¹Institute of Structural Analysis and Antiseismic Research, National Technical University of Athens, Zografou Campus, Athens 15780, Greece
