Optimization of GEM using OpenMP GPU Offloading

Qiheng Cai; Junyi Cheng; Yang Chen; Oryspayev Dossay; Paul Lin; D'azevedo Ed; Scott E Parker

POSTER

Abstract

GEM is a particle-in-cell gyrokinetic code for investigating low-frequency phenomena in tokamak plasmas, such as microturbulence and energetic-particle-driven Alfvén waves. In GEM, the particle arrays should reside in GPU memory rather than CPU memory, with all particle loops executed on the GPU. The particle shift consists of two steps: an initialization step and the actual data movement. The first step should be modified so that more loops can run on the GPU, while in the second step the particles actually transferred should be updated between CPU and GPU. To minimize data transfer and offload computation from the CPU to the GPU, we port GEM from OpenACC to OpenMP GPU offloading and describe the details of the conversion. We then compare the acceleration achieved on CPU versus GPU and with OpenACC versus OpenMP. Finally, we present and discuss profiling data for weak scaling (fixed grid size with increasing particle number) and strong scaling (grid size and particle number increased in proportion) on different machines (e.g., Summit and Cori).
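
The data-residency pattern described in the abstract can be sketched as follows. This is a minimal illustration in C, not code from GEM (which is written in Fortran): the function name `push_particles`, the arrays `x` and `v`, and the simple position update are all hypothetical stand-ins for a particle loop. The comments show the OpenACC directive that each OpenMP directive would typically replace. The arrays are copied to the device once, every loop runs there, and data returns to the host only when needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical particle push: advance n positions x by nsteps steps of
 * size dt using velocities v. Names and physics are illustrative only. */
void push_particles(double *x, const double *v, int n, int nsteps, double dt)
{
    assert(n >= 0 && nsteps >= 0);

    /* Copy the particle arrays to the device once and keep them there.
     * OpenACC equivalent: #pragma acc enter data copyin(x[0:n], v[0:n]) */
    #pragma omp target enter data map(to: x[0:n], v[0:n])

    for (int step = 0; step < nsteps; ++step) {
        /* Run the particle loop on the device; x and v are already present.
         * OpenACC equivalent: #pragma acc parallel loop present(x, v) */
        #pragma omp target teams distribute parallel for
        for (int i = 0; i < n; ++i) {
            x[i] += dt * v[i];
        }
    }

    /* Bring back only what the host needs (here, the positions).
     * OpenACC equivalent: #pragma acc update host(x[0:n]) */
    #pragma omp target update from(x[0:n])

    /* Release the device copies.
     * OpenACC equivalent: #pragma acc exit data delete(x[0:n], v[0:n]) */
    #pragma omp target exit data map(delete: x[0:n], v[0:n])
}
```

Calling `push_particles(x, v, n, nsteps, dt)` from a host program performs one host-to-device copy and one device-to-host copy regardless of `nsteps`. When no GPU is available, or the compiler is invoked without OpenMP support, the same source falls back to running on the host.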

Publication: NA

Presenters

Qiheng Cai

University of Colorado, Boulder

Authors

Qiheng Cai

University of Colorado, Boulder
Junyi Cheng

University of Colorado, Boulder
Yang Chen

University of Colorado, Boulder
Oryspayev Dossay

Brookhaven National Laboratory
Paul Lin

Lawrence Berkeley National Laboratory
D'azevedo Ed

Oak Ridge National Laboratory
Scott E Parker

University of Colorado, Boulder