Turbulence Simulation Using Many Graphics Processors
ORAL
Abstract
Unsteady simulations of turbulence are performed using up to 64 graphics processors on the NSF XSEDE supercomputer Lincoln, located at NCSA. For a $512^3$ simulation, 16 GPUs (Tesla S1070) run about 45 times faster than the same number of CPU cores of the quad-core Intel Harpertown processors on the same machine. The code is optimized to use the fast shared memory on the GPUs and to overlap communication with computation. Results show that the computation is now so fast that even for large problems, with up to 8 million unknowns per GPU, the MPI communication time controls the scaling behavior of the CFD algorithm.
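As a rough check on the problem sizes quoted above, the following sketch computes the load per GPU when a $512^3$ grid is split evenly over 16 GPUs. It assumes one unknown per grid point and, for the boundary estimate, a one-dimensional slab decomposition with a one-cell-deep exchange layer on each side. Neither assumption is stated in the abstract, and the function names are hypothetical.

```python
def unknowns_per_gpu(n: int, gpus: int) -> int:
    """Grid points per GPU for an n^3 grid split evenly across `gpus` devices.

    Assumes one unknown per grid point (an illustrative assumption).
    """
    total = n ** 3
    if total % gpus != 0:
        raise ValueError("grid does not divide evenly across GPUs")
    return total // gpus


def slab_boundary_fraction(n: int, gpus: int) -> float:
    """Fraction of a GPU's points on its two exchange faces.

    Assumes a 1-D slab decomposition along one axis with a one-cell-deep
    boundary layer on each side (hypothetical; the actual decomposition
    is not given in the abstract).
    """
    if n % gpus != 0:
        raise ValueError("axis does not divide evenly across GPUs")
    thickness = n // gpus          # slab thickness in cells
    return min(1.0, 2.0 / thickness)


if __name__ == "__main__":
    print(unknowns_per_gpu(512, 16))        # 8388608, about 8 million
    print(slab_boundary_fraction(512, 16))  # 0.0625
```

Under these assumptions each GPU holds about 8.4 million points, consistent with the "up to 8 million unknowns per GPU" figure, while only a small fraction of those points must be exchanged with neighbors each step. That is the situation in which a fast compute kernel can leave MPI transfer time as the dominant cost.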
–
Authors
-
Ali Khajeh-Saeed
University of Massachusetts Amherst, Mechanical and Industrial Engineering, Amherst, MA 01003, United States; Aerospace Engineering Department, Sharif University of Technology, Tehran, Iran
-
J. Blair Perot
University of Massachusetts Amherst, Mechanical and Industrial Engineering, Amherst, MA 01003, United States