Exploring novel algorithmic strategies for optimal performance of discontinuous Galerkin-based flow solvers on GPU-based systems

ORAL

Abstract

Heterogeneous architectures, particularly those based on graphics processing units (GPUs), are becoming increasingly prevalent in high-performance computing. However, GPU programming is challenging, and achieving optimal performance requires a different approach from that used in traditional CPU-based implementations. In this talk, we present a case study of a computational fluid dynamics (CFD) application based on the high-order discontinuous Galerkin (DG) method. While the DG method is well suited to parallelization, the surface-term computations for the gradients and fluxes present a significant performance bottleneck on GPUs because of their non-contiguous memory accesses. To address this issue, we explore novel performance strategies such as the use of shared memory, intermediate memory coalescing, and alternate data layouts for the element nodes. The implementation and performance of these strategies will be assessed across different GPU architectures using the OCCA portability framework. The overall speedup of the application for different polynomial orders will also be evaluated and presented. The goal is to investigate novel approaches for optimal performance of DG-based CFD applications on GPUs, which will contribute to the advancement of scientific studies that use CFD simulations on modern supercomputers.
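
As a rough illustration of why surface terms lead to non-contiguous accesses and how a data layout can change the access stride, the sketch below computes memory offsets on the host in plain Python. It is not the authors' implementation: the hexahedral tensor-product element, the node numbering i + n1d*(j + n1d*k), the face labels "i0" and "k0", and the two layout functions are all assumptions made for illustration only.

```python
def face_node_ids(n1d, face):
    """Local volume-node indices lying on one face of a hex element.

    Assumes n1d nodes per direction, numbered i + n1d*(j + n1d*k).
    The face labels "i0" (i = 0) and "k0" (k = 0) are hypothetical names.
    """
    ids = []
    for b in range(n1d):
        for a in range(n1d):
            if face == "i0":
                i, j, k = 0, a, b
            elif face == "k0":
                i, j, k = a, b, 0
            else:
                raise ValueError("unknown face label: " + face)
            ids.append(i + n1d * (j + n1d * k))
    return ids


def element_major(e, n, num_elements, nodes_per_element):
    """Offset when all nodes of one element are stored together."""
    return e * nodes_per_element + n


def node_major(e, n, num_elements, nodes_per_element):
    """Offset when the same node of all elements is stored together."""
    return n * num_elements + e


def strides(addresses):
    """Differences between consecutive addresses in a list."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


if __name__ == "__main__":
    n1d = 3                      # polynomial order 2, assumed for the demo
    nodes = n1d ** 3
    num_elements = 4

    # Threads mapped to face nodes of one element: stride depends on the face.
    print("i0 face strides:", strides(face_node_ids(n1d, "i0")))
    print("k0 face strides:", strides(face_node_ids(n1d, "k0")))

    # Threads mapped to elements, all reading the same local node.
    n = face_node_ids(n1d, "i0")[2]
    em = [element_major(e, n, num_elements, nodes) for e in range(num_elements)]
    nm = [node_major(e, n, num_elements, nodes) for e in range(num_elements)]
    print("element-major strides:", strides(em))
    print("node-major strides:", strides(nm))
```

Under these assumptions, the nodes of the i = 0 face are separated by a stride of n1d, while those of the k = 0 face are contiguous; and when neighboring threads handle neighboring elements, the node-major layout gives unit stride where the element-major layout gives a stride equal to the number of nodes per element. Unit stride across neighboring threads is the access pattern that GPUs can coalesce.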

Presenters

  • Umesh Unnikrishnan

    Argonne National Laboratory

Authors

  • Umesh Unnikrishnan

    Argonne National Laboratory

  • Kris Rowe

    Argonne National Laboratory

  • Saumil S Patel

    Argonne National Laboratory