Comparison of VPIC Performance on Several Modern Architectures

POSTER

Abstract

VPIC [K. J. Bowers, B. J. Albright, L. Yin, B. Bergen, and T. J. T. Kwan, Phys. Plasmas 15, 055703 (2008)] is being ported to and optimized for several modern architectures. These include KNL processors available on Trinity, Cori, and Stampede2; Skylake processors available on Mare Nostrum and Stampede2; IBM Power 9 processors and Volta GPUs available on Summit and Sierra; and ARM ThunderX2 processors, soon to be available on Astra at Sandia. VPIC is in production on several of these systems. These architectures vary in many ways, including available memory bandwidth, vector length, threads per core, clock frequency, and overall node architecture. This work focuses on single-node performance. Current efforts to optimize single-node performance are exploring changes to the data layout of key data structures and profiling performance with a variety of performance analysis tools. Results comparing the performance of VPIC on these architectures will be presented.
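
As a rough illustration of what a data-layout change can mean for vectorized particle code, the sketch below contrasts array-of-structures (AoS) and structure-of-arrays (SoA) storage. It is not taken from VPIC: the `Particle` fields, the `ParticlesSoA` type, and the `push_x_aos` / `push_x_soa` functions are hypothetical names chosen for this example only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical particle record for illustration; not VPIC's actual type.
struct Particle {
    float x, y, z;     // position
    float ux, uy, uz;  // momentum
};

// Array of structures (AoS): all fields of one particle are contiguous.
using ParticlesAoS = std::vector<Particle>;

// Structure of arrays (SoA): each field is contiguous across particles.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> ux, uy, uz;

    explicit ParticlesSoA(std::size_t n)
        : x(n), y(n), z(n), ux(n), uy(n), uz(n) {}
};

// Advance x by dt * ux. In AoS, successive x values are
// sizeof(Particle) bytes apart, so loads are strided.
void push_x_aos(ParticlesAoS& p, float dt) {
    for (std::size_t i = 0; i < p.size(); ++i) {
        p[i].x += dt * p[i].ux;
    }
}

// Same update in SoA: successive x and ux values are adjacent in memory,
// so the loop reads and writes unit-stride arrays.
void push_x_soa(ParticlesSoA& p, float dt) {
    const std::size_t n = p.x.size();
    for (std::size_t i = 0; i < n; ++i) {
        p.x[i] += dt * p.ux[i];
    }
}
```

Both functions compute the same result. Which layout runs faster depends on the vector length and memory bandwidth of the target architecture and has to be measured on each one.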

Presenters

  • William D Nystrom

    Los Alamos National Laboratory

Authors

  • William D Nystrom

    Los Alamos National Laboratory

  • Robert F. Bird

    Los Alamos National Laboratory

  • William S Daughton

    Los Alamos National Laboratory

  • Fan Guo

    Los Alamos National Laboratory

  • Ari Le

    Los Alamos National Laboratory

  • Hui Li

    Los Alamos National Laboratory

  • Adam J Stanier

    Los Alamos National Laboratory

  • David J. Stark

    Los Alamos National Laboratory

  • Lin Yin

    Los Alamos National Laboratory

  • Brian James Albright

    Los Alamos National Laboratory