Toward Petaflop First Principles Kinetic Plasma Simulation

COFFEE_KLATCH · Invited

Abstract

Due to physical limitations (such as the speed of light), moving data between and even within modern microprocessors takes more time than performing computations. As a result, individual processor core performance is stagnant, multicore processors are ubiquitous, and traditional programming styles cannot fully exploit the potential of modern computers. This talk will discuss the architecture and implementation of the 3D electromagnetic relativistic particle-in-cell code VPIC for LANL's Roadrunner supercomputer. Roadrunner is expected to have 13,000 IBM Cell microprocessors (each Cell contains a dual-threaded Power core and 8 specialized vector cores) and to be capable of over a petaflop ($10^{15}$ floating point operations per second). VPIC minimizes data movement and allows the vector extensions of modern processors to be utilized portably. This made it possible to port VPIC quickly while achieving unprecedented performance. The initial port pushed and accumulated 0.13 billion particles per second per Cell, equivalent to 1.0 billion per second per 8-Cell node or to sustaining Roadrunner at 0.4 petaflop. Higher performance is likely as the port is refined. Regardless, the performance already demonstrated will enable previously intractable simulations in numerous areas of plasma physics, including magnetic reconnection and laser-plasma interactions.
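
The throughput figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes that all 13,000 Cells sustain the quoted per-Cell rate and that the 0.4 petaflop figure refers to the full machine. The implied floating point operations per particle is a derived number, not one stated in the abstract.

```python
# Figures quoted in the abstract.
PARTICLES_PER_SEC_PER_CELL = 0.13e9  # particles pushed and accumulated per second per Cell
CELLS_PER_NODE = 8                   # "8 Cell node"
TOTAL_CELLS = 13_000                 # expected Roadrunner Cell count
SUSTAINED_FLOPS = 0.4e15             # 0.4 petaflop


def node_rate(per_cell: float = PARTICLES_PER_SEC_PER_CELL,
              cells_per_node: int = CELLS_PER_NODE) -> float:
    """Particles per second for one node."""
    return per_cell * cells_per_node


def machine_rate(per_cell: float = PARTICLES_PER_SEC_PER_CELL,
                 total_cells: int = TOTAL_CELLS) -> float:
    """Particles per second for the whole machine.

    Assumption: every Cell sustains the quoted per-Cell rate.
    """
    return per_cell * total_cells


def implied_flops_per_particle(flops: float = SUSTAINED_FLOPS) -> float:
    """Floating point operations per particle implied by the quoted figures.

    Assumption: the 0.4 petaflop figure applies to the full machine.
    """
    return flops / machine_rate()


print(f"per node:    {node_rate():.3g} particles/s")
print(f"per machine: {machine_rate():.3g} particles/s")
print(f"implied:     {implied_flops_per_particle():.0f} flop per particle")
```

Under these assumptions the per-node rate comes to about 1.04 billion particles per second, consistent with the rounded 1.0 billion quoted above, and the machine-wide rate to about 1.7 trillion particles per second, which implies a cost of a few hundred floating point operations per particle.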

Authors

  • Kevin Bowers

    Plasma Theory and Applications (X-1-PTA), Los Alamos National Lab, MS B259, PO Box 1663, Los Alamos, NM 87545