GPU implementation of generalized Bloch wave calculation for the first-principles electron transport calculation.

ORAL

Abstract

In recent years, demand for GPUs and other accelerators has grown with the widespread use of machine-learning applications. As a result, the number of accelerator-equipped supercomputers in the TOP500 list has increased every year. To use these computational resources efficiently, applications are expected to run on GPUs.

In this study, we port the generalized Bloch wave calculation, which is part of the first-principles electron-transport calculation code “RSPACE” that we are developing, so that it runs efficiently on GPU-equipped supercomputers.

In RSPACE, the calculation of transport properties is performed using the wave function matching method. In this calculation, one of the most computationally intensive parts is the computation of generalized Bloch waves for the electrodes.
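
For orientation, the generalized Bloch condition can be written schematically as below. This is a generic block-tridiagonal form, not necessarily the exact real-space formulation used in RSPACE; the symbols $A$, $B$, $\phi_m$, $\lambda$, $k$, and $a$ are introduced here for illustration only.

```latex
% Assumed notation: A = Hamiltonian block of one electrode unit cell,
% B = coupling block to the next cell, \phi_m = wave function in cell m,
% a = unit-cell length, E = energy.
-B^{\dagger}\phi_{m-1} + (E - A)\,\phi_m - B\,\phi_{m+1} = 0,
\qquad \phi_{m+1} = \lambda\,\phi_m
\;\Longrightarrow\;
\left[-\lambda^{-1}B^{\dagger} + (E - A) - \lambda B\right]\phi_m = 0,
\qquad \lambda = e^{ika}
```

Under these assumptions, solutions with $|\lambda| = 1$ are propagating Bloch waves, while $|\lambda| \neq 1$ corresponds to evanescent waves with complex $k$. Both kinds must be found at each energy, which is one reason this step is costly.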

When porting the programs, we tune the loop lengths to suit the number of GPU compute cores and minimize data transfer to reduce communication time between the CPU and GPU. Furthermore, to achieve efficient parallel processing, we implement dynamic task allocation using MPI shared memory and the compare-and-swap operation.
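
The idea behind dynamic task allocation with compare-and-swap can be illustrated with a minimal sketch. It is not the RSPACE implementation: Python threads stand in for MPI processes, and the hypothetical `SharedCounter` class emulates a single integer in shared memory (in an MPI code, the corresponding operation would typically be `MPI_Compare_and_swap` on a shared-memory window). The names `claim_task` and `run_workers` are likewise assumptions made for illustration.

```python
import threading


class SharedCounter:
    """Emulates one integer in shared memory with an atomic compare-and-swap."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store `new` only if the value equals `expected`; return the value seen."""
        with self._lock:
            seen = self._value
            if seen == expected:
                self._value = new
            return seen


def claim_task(counter, n_tasks):
    """Return the next unclaimed task index, or None when all tasks are taken."""
    while True:
        current = counter.load()
        if current >= n_tasks:
            return None
        # The claim succeeds only if no other worker advanced the counter first.
        if counter.compare_and_swap(current, current + 1) == current:
            return current
        # Otherwise another worker won the race; retry with a fresh value.


def run_workers(n_workers, n_tasks):
    """Let each worker pull tasks until none remain; return tasks per worker."""
    counter = SharedCounter()
    claimed = [[] for _ in range(n_workers)]

    def worker(rank):
        while (task := claim_task(counter, n_tasks)) is not None:
            claimed[rank].append(task)  # placeholder for the real computation

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed
```

Because each worker takes a new task only when it has finished the previous one, faster workers naturally process more tasks, which helps balance the load when task costs are uneven.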

To evaluate the performance of the developed codes, we measure the runtimes on two types of GPU-equipped supercomputers using three models of carbon nanotubes (CNTs), which vary in diameter and computational load. As a demonstration of the developed code, we investigate the complex band structure of armchair CNT models.

Presenters

  • Takanori Akamatsu

    Kobe University

Authors

  • Takanori Akamatsu

    Kobe University

  • Mitsuharu Uemoto

    Kobe University

  • Yoshiyuki Egami

    Hokkaido University

  • Tomoya Ono

    Kobe University