Turbulence Simulation using many Graphics Processors

ORAL

Abstract

Unsteady simulations of turbulence are performed using up to 64 graphics processors on Lincoln, the NSF XSEDE supercomputer located at NCSA. For a $512^3$ simulation, 16 GPUs (Tesla S1070) run about 45 times faster than the same number of CPU cores of the quad-core Intel Harpertown processors on the same machine. The code is optimized to use the fast shared memory on the GPUs and to overlap communication with computation. Results show that the computation is now so fast that even for large problems, with up to 8 million unknowns per GPU, the MPI communication time controls the scaling behavior of the CFD algorithm.
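The problem sizes quoted above can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the authors' code. It assumes one unknown per grid point, an even split of the $512^3$ mesh, and, purely for illustration, a one-dimensional slab decomposition with a halo one grid point wide. The abstract does not state the decomposition or the halo width, and the function names are hypothetical.

```python
def unknowns_per_gpu(n: int, gpus: int) -> int:
    """Grid points owned by each GPU when an n^3 mesh is split evenly."""
    total = n ** 3
    if total % gpus != 0:
        raise ValueError("mesh does not divide evenly among the GPUs")
    return total // gpus


def slab_halo_fraction(n: int, gpus: int) -> float:
    """Halo points exchanged per owned point for a 1D slab decomposition.

    Assumption (not from the abstract): each GPU owns a slab of
    n x n x (n / gpus) points and exchanges one n x n plane with each of
    its two neighbours (periodic directions, halo one point wide).
    """
    if n % gpus != 0:
        raise ValueError("slab thickness must be a whole number of planes")
    thickness = n // gpus
    halo_points = 2 * n * n            # one plane per neighbour
    owned_points = n * n * thickness   # interior of the slab
    return halo_points / owned_points


if __name__ == "__main__":
    print(unknowns_per_gpu(512, 16))    # 8388608, i.e. about 8 million
    print(slab_halo_fraction(512, 16))  # 0.0625
    print(slab_halo_fraction(512, 64))  # 0.25
```

Under these assumptions, spreading a fixed mesh over more GPUs raises the share of each GPU's data that must be exchanged, which is consistent with communication time coming to control the scaling.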

Authors

  • Ali Khajeh-Saeed

    University of Massachusetts Amherst, Mechanical and Industrial Engineering, Amherst, MA 01003, United States; Aerospace Engineering Department, Sharif University of Technology, Tehran, Iran

  • J. Blair Perot

    University of Massachusetts Amherst, Mechanical and Industrial Engineering, Amherst, MA 01003, United States