QMCPACK performance portability on NVIDIA and AMD GPUs
ORAL
Abstract
As exascale supercomputers are being deployed in the U.S., QMCPACK (https://qmcpack.org) developers have migrated the code base to a performance-portable implementation that supports production science on these powerful machines out of the box. With a redesigned code architecture, the historically divergent code paths for CPUs and GPUs have been unified, and a core set of features is now available on all computing platforms, both CPUs and GPUs. With the portable OpenMP target offload programming model and high-quality vendor linear algebra libraries, impressive performance has been achieved with minimal vendor-specific customization. We show current performance for materials calculations on NVIDIA and AMD GPUs across a broad range of electron counts and analyze the remaining inefficiencies.
Presenters
- Ye Luo, Argonne National Laboratory
Authors
- Ye Luo, Argonne National Laboratory
- Peter Doak, Oak Ridge National Laboratory
- Paul Kent, Oak Ridge National Laboratory