An open dataset from the Large Plasma Device for machine learning and profile prediction

POSTER

Abstract

An open, machine learning (ML)-oriented dataset has been created using the flexible configurability and high repetition rate of the Large Plasma Device (LAPD). Plasma-ML work often relies on closed tokamak datasets that are biased towards the science goals pursued at a particular time. This dataset attempts to minimize that bias and achieve balance via Latin hypercube sampling of the machine configuration space. Over 100,000 shots were collected spanning over 30 different LAPD configurations by varying the axial magnetic field profile, gas puff flow, gas puff duration, and discharge voltage. Spatially resolved Langmuir probe data were collected for local density, potential, and electron temperature measurements at several axial locations. Data from additional diagnostics were also collected, including Thomson scattering, interferometers, spectrometers, a diamagnetic loop, visible light diodes, and a fast framing camera. Both time-averaged signals and time series of ion saturation current (isat) can be accurately estimated using a neural network, which performs dramatically better than a linear-like baseline. This model is evaluated immediately after a shot is taken to assist with real-time machine operation. Results from input ablation studies and prediction via generative modeling will also be presented. The dataset developed for this study is openly available and should be useful for pedagogy and for testing plasma-oriented ML models. Current work suggests many promising avenues for ML modeling and inference using this dataset.
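
The Latin hypercube sampling mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function name `latin_hypercube`, the single field-strength stand-in for the axial magnetic field profile, and all parameter ranges are hypothetical placeholders chosen only to show how such a design spreads configurations evenly across each control knob.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that, in every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniformly random point inside each stratum of the unit interval.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired randomly across dimensions.
        rng.shuffle(points)
        columns.append([low + p * (high - low) for p in points])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Hypothetical ranges for illustration only (not actual LAPD operating limits).
bounds = [
    (250.0, 1500.0),  # stand-in for axial field strength, in gauss
    (0.0, 1.0),       # gas puff flow, normalized
    (5.0, 40.0),      # gas puff duration, in ms
    (70.0, 150.0),    # discharge voltage, in volts
]

configs = latin_hypercube(30, bounds)
for field, flow, duration, voltage in configs[:3]:
    print(f"B={field:7.1f} G  flow={flow:.2f}  puff={duration:5.1f} ms  V={voltage:6.1f} V")
```

Compared with a plain random draw, this design guarantees that every stratum of every parameter is visited once, which is one way to avoid over-representing a favored operating regime.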

Presenters

  • Phil Travis

    University of California, Los Angeles

Authors

  • Phil Travis

    University of California, Los Angeles

  • Troy A Carter

    University of California, Los Angeles