Low occupancy high performance elemental products in assembly free FEM on GPU

Engineering with Computers - Volume 38 - Pages 2189-2204 - 2021
Nileshchandra K. Pikle1, Shailesh R. Sathe2, Arvind Y. Vyavahare3
1School of Computer Science and Engineering, Vellore Institute of Technology, Amravati, India
2Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
3Department of Applied Mechanics, Visvesvaraya National Institute of Technology, Nagpur, India

Abstract

Assembly-free FEM bypasses the assembly step and solves the system of linear equations at the element level using a Conjugate Gradient (CG) type iterative solver. The smaller dense Matrix-vector Products (MvPs) are encapsulated within the CG solver and are computed either at the element level or at the degree of freedom (DoF) level. Both strategies exploit the computing power of the GPU effectively, but their performance suffers from uncoalesced global memory access. This paper proposes an improved MvP strategy for assembly-free FEM that achieves coalesced global memory access by using faster on-chip shared memory and the GPU's texture cache. Because a GPU has limited shared memory (a few KB), the proposed technique suffers from a problem known as low occupancy. Despite the low occupancy, the proposed strategy outperforms both the element-based and the DoF-based MvP strategies on GPU. In numerical experiments, the GPU implementation of the proposed MvP outperforms the element-level and DoF-level strategies by factors of approximately 7 and 1.5, respectively.
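To make the element-level MvP idea concrete, the following is a minimal CPU sketch in plain Python, not the paper's GPU kernel. It assumes a hypothetical 1D bar mesh of two-node elements that all share the same 2x2 stiffness matrix; the names `element_mvp`, `assembled_mvp`, `conn`, and `k_e` are illustrative and do not come from the paper. It computes y = Kx as a sum of small dense elemental products and checks the result against an explicitly assembled matrix.

```python
# Assembly-free (element-by-element) matrix-vector product on a 1D bar mesh.
# Assumption: every element has unit stiffness, so K_e = [[1, -1], [-1, 1]].

def element_mvp(conn, k_e, x):
    """Compute y = K x from elemental products, without forming K."""
    y = [0.0] * len(x)
    for nodes in conn:
        # Gather the entries of x that belong to this element.
        x_e = [x[n] for n in nodes]
        # Small dense elemental product y_e = K_e x_e.
        y_e = [sum(k_e[i][j] * x_e[j] for j in range(len(nodes)))
               for i in range(len(nodes))]
        # Scatter-add the elemental result into the global vector.
        for i, n in enumerate(nodes):
            y[n] += y_e[i]
    return y

def assembled_mvp(conn, k_e, x):
    """Reference path: assemble the global matrix K, then multiply."""
    n = len(x)
    K = [[0.0] * n for _ in range(n)]
    for nodes in conn:
        for i, a in enumerate(nodes):
            for j, b in enumerate(nodes):
                K[a][b] += k_e[i][j]
    return [sum(K[a][b] * x[b] for b in range(n)) for a in range(n)]

conn = [(0, 1), (1, 2), (2, 3)]      # element-to-node connectivity
k_e = [[1.0, -1.0], [-1.0, 1.0]]     # shared elemental stiffness matrix
x = [1.0, 2.0, 4.0, 8.0]             # vector supplied by the CG iteration

y = element_mvp(conn, k_e, x)
print(y)  # [-1.0, -1.0, -2.0, 4.0]
```

In a GPU version, the loop over elements would typically be spread across threads, and the gather and scatter-add steps are where global memory access patterns matter; how the paper arranges those accesses through shared memory and the texture cache is not shown here.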
