Scenario Optimization for NSTX-U via Reinforcement Learning
POSTER
Abstract
Reliably achieving desired operating regimes using available actuators is crucial for nuclear fusion to become a viable energy source. In this work, a Reinforcement Learning (RL) agent is trained on the plasma simulation code COTSIM (Control Oriented Transport SIMulator) to determine optimal actuator trajectories for reaching target plasma regimes. These regimes are characterized by high normalized beta and a significant fraction of noninductive current drive, aligning with the core operational objectives of NSTX-U. During training, the target is systematically varied, enabling a single agent to reach multiple unique targets within a given parameter space. This allows for fast optimizations between experimental shots that can adapt to varying circumstances. To discourage the agent from exploring regimes associated with plasma instabilities, nonlinear constraints are incorporated within the reward function via penalty terms. Additionally, an actuator mask is used to accommodate optimizations of different dimensionality and to model potential actuator failures. Multiple RL architectures are evaluated for their effectiveness, flexibility, and speed, and are compared with several gradient-based and gradient-free optimization methods.
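The two reward-shaping ideas mentioned above (penalty terms for constraint violations and an actuator mask) can be sketched in a few lines. This is a minimal illustration, not the actual COTSIM/NSTX-U implementation: the state components, the stability limit, and the penalty weight are all hypothetical placeholders.

```python
import numpy as np

def shaped_reward(state, target, penalty_weight=10.0, beta_limit=4.0):
    """Illustrative goal-conditioned reward: negative tracking error
    minus a quadratic penalty for violating a (hypothetical) soft
    constraint on normalized beta. State is (beta_N, f_NI)."""
    tracking = -np.linalg.norm(state - target)
    beta_n = state[0]
    violation = max(0.0, beta_n - beta_limit)  # soft-constraint excess
    return tracking - penalty_weight * violation**2

def apply_actuator_mask(action, mask):
    """Zero out actuators that are unavailable or have failed, so one
    agent can handle optimizations of different dimensionality."""
    return action * mask
```

Varying `target` across training episodes is what lets a single trained agent be queried for different target regimes between shots without retraining.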
Presenters
- Brian Robert Leard (Lehigh University)
Authors
- Brian Robert Leard (Lehigh University)
- Sai Tej Paruchuri (Lehigh University)
- Tariq Rafiq (Lehigh University)
- Eugenio Schuster (Lehigh University)