GPU-acceleration of stiffness matrix calculation and efficient initialization of EFG meshless methods
References
Li, 2002, Meshfree and particle methods and their applications, Appl. Mech. Rev., 55, 1, 10.1115/1.1431547
Nguyen, 2008, Meshless methods: A review and computer implementation aspects, Math. Comput. Simul., 79, 763, 10.1016/j.matcom.2008.01.003
Belytschko, 1996, Meshless methods: An overview and recent developments, Comput. Methods Appl. Mech. Engrg., 139, 3, 10.1016/S0045-7825(96)01078-X
Danielson, 2000, Parallel computation of meshless methods for explicit dynamic analysis, Int. J. Numer. Methods Engrg., 47, 1323, 10.1002/(SICI)1097-0207(20000310)47:7<1323::AID-NME827>3.0.CO;2-0
Danielson, 2000, Large-scale application of some modern CSM methodologies by parallel computation, Adv. Engrg. Software, 31, 501, 10.1016/S0965-9978(00)00033-8
Liu, 2007, A smoothed finite element method for mechanics problems, Comput. Mech., 39, 859, 10.1007/s00466-006-0075-4
Wang, 2002, A point interpolation meshless method based on radial basis functions, Int. J. Numer. Methods Engrg., 54, 1623, 10.1002/nme.489
Gu, 2001, A coupled element free Galerkin/boundary element method for stress analysis of two-dimensional solids, Comput. Methods Appl. Mech. Engrg., 190, 4405, 10.1016/S0045-7825(00)00324-8
Yuan, 2006, High performance sparse solver for unsymmetrical linear equations with out-of-core strategies and its application on meshless methods, Appl. Math. Mech. (Engl. Ed.), 27, 1339, 10.1007/s10483-006-1006-1
Wu, 2008, A high performance large sparse symmetric solver for the meshfree Galerkin method, Int. J. Comput. Methods, 5, 533, 10.1142/S0219876208001613
Divo, 2006, Iterative domain decomposition meshless method modeling of incompressible viscous flows and conjugate heat transfer, Engrg. Anal. Bound. Elem., 30, 465, 10.1016/j.enganabound.2006.02.002
Metsis, 2012, Overlapping and non-overlapping domain decomposition methods for large-scale meshless EFG simulations, Comput. Methods Appl. Mech. Engrg., 229–232, 128, 10.1016/j.cma.2012.03.012
Sanders, 2010
Kirk, 2010
NVIDIA Corporation, CUDA C Best Practices Guide, NVIDIA GPU Computing Documentation | NVIDIA Developer Zone, NVIDIA, 2012.
TOP500 Supercomputing Sites. Available: <http://www.top500.org/>.
Kampolis, 2010, CFD-based analysis and two-level aerodynamic optimization on graphics processing units, Comput. Methods Appl. Mech. Engrg., 199, 712, 10.1016/j.cma.2009.11.001
Elsen, 2008, Large calculation of the flow over a hypersonic vehicle using a GPU, J. Comput. Phys., 227, 10148, 10.1016/j.jcp.2008.08.023
Thibault, 2012, Accelerating incompressible flow computations with a Pthreads-CUDA implementation on small-footprint multi-GPU platforms, J. Supercomput., 59, 693, 10.1007/s11227-010-0468-1
De La Asunción, 2011, Simulation of one-layer shallow water systems on multicore and CUDA architectures, J. Supercomput., 58, 206, 10.1007/s11227-010-0406-2
Zhou, 2012, GPU implementation of lattice Boltzmann method for flows with curved boundaries, Comput. Methods Appl. Mech. Engrg., 225–228, 65, 10.1016/j.cma.2012.03.011
Sunarso, 2010, GPU-accelerated molecular dynamics simulation for study of liquid crystalline flows, J. Comput. Phys., 229, 5486, 10.1016/j.jcp.2010.03.047
Anderson, 2008, General purpose molecular dynamics simulations fully implemented on graphics processing units, J. Comput. Phys., 227, 5342, 10.1016/j.jcp.2008.01.047
Wadbro, 2009, Megapixel topology optimization on a graphics processing unit, SIAM Rev., 51, 707, 10.1137/070699822
Komatitsch, 2010, High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster, J. Comput. Phys., 229, 7692, 10.1016/j.jcp.2010.06.024
Takahashi, 2009, GPU-accelerated boundary element method for Helmholtz’ equation in three dimensions, Int. J. Numer. Methods Engrg., 80, 1295, 10.1002/nme.2661
Joldes, 2010, Real-time nonlinear finite element computations on GPU-Application to neurosurgical simulation, Comput. Methods Appl. Mech. Engrg., 199, 3305, 10.1016/j.cma.2010.06.037
Tomov, 2010, Towards dense linear algebra for hybrid GPU accelerated manycore systems, Parallel Comput., 36, 232, 10.1016/j.parco.2009.12.005
Schenk, 2008, Algorithmic performance studies on graphics processing units, J. Parallel Distrib. Comput., 68, 1360, 10.1016/j.jpdc.2008.05.008
Elble, 2010, GPU computing with Kaczmarz’s and other iterative algorithms for linear systems, Parallel Comput., 36, 215, 10.1016/j.parco.2009.12.003
A. Cevahir, A. Nukada, S. Matsuoka, Fast conjugate gradients with multiple GPUs, in: 9th International Conference on Computational Science, ICCS 2009, Baton Rouge, LA, 2009, pp. 893–903.
Cevahir, 2010, High performance conjugate gradient solver on multi-GPU clusters using hypergraph partitioning, Comput. Sci.-Res. Develop., 25, 83, 10.1007/s00450-010-0112-6
Papadrakakis, 2011, A new era in scientific computing: Domain decomposition methods in hybrid CPU–GPU architectures, Comput. Methods Appl. Mech. Engrg., 200, 1490, 10.1016/j.cma.2011.01.013
Trobec, 2009, Computational complexity and parallelization of the meshless local Petrov–Galerkin method, Comput. Struct., 87, 81, 10.1016/j.compstruc.2008.08.003
C. Felippa, Chapter 15-Solid Elements: Overview, Advanced Finite Element Methods (ASEN 6367) Course Material, University of Colorado, 2011.
Sparse matrix: Dictionary of keys (DOK), Wikipedia, the free encyclopedia, Sep 2012.
Hash table, Wikipedia, the free encyclopedia, Sep 2012.
Sparse matrix: Coordinate list (COO), Wikipedia, the free encyclopedia, Sep 2012.
W.W. Hwu, D.B. Kirk, Parallelism Scalability, Programming and tUning Massively Parallel Systems (PUMPS), Barcelona, 2011.