QMCPACK’s Exascale Performance Portability Strategies
ORAL
Abstract
The upcoming Exascale era encompasses multiple accelerator technologies from different vendors. This poses both performance and portability challenges for science applications. Here we outline the strategies newly adopted by the open-source quantum Monte Carlo (QMC) code QMCPACK (https://qmcpack.org). The implemented algorithms provide very high accuracy for atoms, molecules and solids, including both metallic and insulating phases. We are targeting high performance on Intel, AMD and NVIDIA GPUs, continued high performance on multicore CPUs, and higher performance than the existing CUDA implementation. For real-space QMC algorithms we have adopted a new design and parallelization strategy within the Monte Carlo algorithm to increase the numerical work that can be exposed to the GPUs and to allow more asynchronous operations. OpenMP target offload is used to execute on the GPUs, with vendor libraries used where possible. We show current performance for materials calculations with a broad range of electron counts and analyze the remaining inefficiencies.
Presenters
-
Paul Kent
Oak Ridge National Laboratory
Authors
-
Paul Kent
Oak Ridge National Laboratory
-
Peter Doak
Oak Ridge National Laboratory
-
Mark Dewing
Argonne National Laboratory
-
Ye Luo
Argonne National Laboratory