Deep Learning Architecture for Surrogate Models of Multi-Scale Plasma Systems
ORAL
Abstract
We present a novel and efficient convolutional neural network (CNN) architecture designed to model the long-term evolution of complex, multi-scale plasma systems. Our approach leverages the principle of scale separation, based on the observation that dynamically emerging plasma structures predominantly interact with nearby features of similar scale. This locality assumption enables significant computational savings by reducing the need to model long-range interactions between distant small-scale features—yielding substantial speed-ups over transformer-based models, which scale quadratically with system size.
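The locality argument can be illustrated with a minimal sketch (this is an illustration of the scaling principle, not the authors' actual architecture): a small convolution kernel with periodic boundaries touches only a fixed neighborhood of each grid cell, so its cost grows linearly with the number of cells, whereas all-to-all attention couples every pair of cells and grows quadratically.

```python
import numpy as np

def periodic_conv2d(field, kernel):
    """Apply a small convolution kernel with periodic boundaries.

    Each output value depends only on a fixed local neighborhood,
    so the cost is O(N * k^2) for N grid cells and a k x k kernel,
    in contrast to the O(N^2) cost of full self-attention.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Wrap-around padding mimics the periodic domain typical of
    # turbulence benchmarks.
    padded = np.pad(field, ((ph, ph), (pw, pw)), mode="wrap")
    out = np.zeros_like(field, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + field.shape[0],
                                         j:j + field.shape[1]]
    return out

# Example: a 3x3 smoothing kernel applied to a random "density" field.
rng = np.random.default_rng(0)
density = rng.standard_normal((64, 64))
kernel = np.ones((3, 3)) / 9.0
smoothed = periodic_conv2d(density, kernel)
```

Because the kernel weights sum to one and the boundary is periodic, the smoothing step conserves the mean of the field, a convenient sanity check for this kind of local operator.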
We evaluate the proposed architecture on a benchmark plasma turbulence problem: the Hasegawa-Wakatani model, which describes drift-wave turbulence in magnetically confined fusion plasmas with a density gradient transverse to a uniform external magnetic field. Training data were generated using the BOUT++ framework. The simulations were executed on the Perlmutter high-performance computing cluster using 8 A100 GPUs; data generation required approximately three hours.
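For context, a commonly used form of the Hasegawa-Wakatani system (shown here as background; the abstract does not specify the exact variant or parameter values used) couples the density fluctuation $n$ and the electrostatic potential $\phi$ through the vorticity $\zeta = \nabla^2 \phi$:

```latex
\begin{aligned}
\frac{\partial \zeta}{\partial t} + \{\phi, \zeta\}
  &= \alpha(\phi - n) - \mu \nabla^4 \zeta, \\
\frac{\partial n}{\partial t} + \{\phi, n\}
  &= \alpha(\phi - n) - \kappa \frac{\partial \phi}{\partial y} - \mu \nabla^4 n,
\qquad \zeta = \nabla^2 \phi,
\end{aligned}
```

where $\{a, b\} = \partial_x a\, \partial_y b - \partial_y a\, \partial_x b$ is the Poisson bracket, $\alpha$ the adiabaticity parameter, $\kappa$ the background density gradient, and $\mu$ a small dissipation coefficient.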
We compare our model’s predictive performance against several variations of a conventional ResNet-based architecture. Our model achieves a multi-fold improvement in long-term prediction accuracy for key statistical metrics, including the spatial and temporal Fourier spectra of plasma density and electric potential, as well as their temporal autocorrelations. Notably, once trained—which takes only minutes on a single A100 GPU—the model can generate accurate predictions in seconds, offering a promising path toward fast, data-driven modeling of multi-scale plasma dynamics.
Presenters
-
Alexander Khrabry
Princeton University
Authors
-
Alexander Khrabry
Princeton University
-
Edward A Startsev
Princeton Plasma Physics Laboratory (PPPL)
-
Andrew Tasman Powis
Princeton Plasma Physics Laboratory, Princeton, USA
-
Igor D Kaganovich
Princeton Plasma Physics Laboratory (PPPL)