Toward Intelligent Fusion Data Workflows: Harmonized Labeling with dFL and Multiscale Simulation Infrastructure via MGKDB

POSTER

Abstract

Progress in fusion energy science relies on turning voluminous, heterogeneous data into usable insight. As one pillar of the Fusion Data Platform (FDP) initiative, the Data Fusion Labeler (dFL) streamlines harmonization, preprocessing, and labeling across experimental diagnostics and multiscale simulations. To meet fusion-specific challenges such as noise, sparsity, imbalance, and temporal skew, dFL offers adaptive smoothing, normalization, resampling, and schema-aware I/O (TokSearch, CMF, IMAS/OMAS). Manual and automated interfaces, paired with built-in dimension-reduction and surrogate-model helpers, shorten the path from raw shots to AI/ML-ready datasets while preserving expert oversight.
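
As a rough illustration of the resampling, smoothing, and normalization steps named above, the following is a minimal standard-library sketch. It is not dFL's API: the function names (resample_uniform, moving_average, zscore) and the fixed-width window are assumptions chosen for illustration, and a simple moving average stands in for the adaptive smoothing the abstract mentions.

```python
from statistics import mean, pstdev


def resample_uniform(times, values, n_points):
    """Linearly interpolate an irregular signal onto n_points evenly spaced times.

    Assumes times is strictly increasing and n_points >= 2.
    """
    t0, t1 = times[0], times[-1]
    step = (t1 - t0) / (n_points - 1)
    grid = [t0 + i * step for i in range(n_points)]
    out, j = [], 0
    for t in grid:
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, out


def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    return [
        mean(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def zscore(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant signal: nothing to scale
    return [(v - mu) / sigma for v in values]


# Hypothetical irregularly sampled diagnostic trace (time, amplitude).
times = [0.0, 1.0, 3.0, 4.0]
signal = [0.0, 2.0, 6.0, 8.0]
grid, resampled = resample_uniform(times, signal, 5)
prepared = zscore(moving_average(resampled, 3))
```

The order shown (resample, then smooth, then normalize) is one plausible choice. Putting all channels on a common time base first keeps the smoothing window comparable across diagnostics with different sampling rates.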

Within the SMARTS project (Surrogate Models for Accurate and Rapid Transport Solutions), the complementary open-source Multiscale GyroKinetics DataBase (MGKDB) provides a schema-driven, metadata-rich repository for multi-resolution gyrokinetic outputs from GENE, CGYRO, TGLF, GX, GS2, and QuaLiKiz. A flexible MongoDB backend, shell and GUI clients, and IMAS hooks support unified code metadata, provenance tracking, benchmarking records, and evolving file formats. MGKDB thus furnishes durable infrastructure for cross-code comparison, validation, and downstream modeling, seamlessly linking simulation and experimental streams prepared by dFL.
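
To make the idea of a schema-driven, metadata-rich record more concrete, here is a small sketch of what a document for one simulation run might look like before it is stored in a document database. The class name RunRecord, every field name, and the input parameters are hypothetical and are not taken from MGKDB's actual schema. Only the list of code names comes from the abstract.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field

# Code names listed in the abstract; everything else below is illustrative.
SUPPORTED_CODES = {"GENE", "CGYRO", "TGLF", "GX", "GS2", "QuaLiKiz"}


@dataclass
class RunRecord:
    """Hypothetical metadata record for one gyrokinetic simulation run."""

    code: str
    code_version: str
    inputs: dict
    tags: list = field(default_factory=list)

    def __post_init__(self):
        # Schema check: reject records from codes the repository does not know.
        if self.code not in SUPPORTED_CODES:
            raise ValueError(f"unsupported code: {self.code}")

    def to_document(self):
        """Return a plain dict with a provenance hash of the inputs."""
        doc = asdict(self)
        # Sorting keys makes the hash independent of parameter ordering.
        canonical = json.dumps(self.inputs, sort_keys=True)
        doc["inputs_sha256"] = hashlib.sha256(canonical.encode()).hexdigest()
        return doc


# Hypothetical input parameters, for illustration only.
record = RunRecord("GENE", "x.y", {"beta": 0.01, "ky": 0.3}, tags=["benchmark"])
document = record.to_document()
```

A plain dictionary like this could be handed to a document store as is. Hashing a canonical form of the inputs is one simple way to detect duplicate runs and to compare equivalent cases across codes.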

Together, dFL and MGKDB advance a reproducible, extensible data ecosystem that accelerates scalable machine learning, multiscale physics integration, and simulation–experiment synergy for the fusion community.

Presenters

  • Craig Michoski

    SapientAI LLC

Authors

  • Craig Michoski

    SapientAI LLC

  • Mathew Waller

    Sophelio

  • Zeyu Li

    General Atomics

  • Brian Sammuli

    General Atomics

  • Raffi M Nazikian

    General Atomics

  • Sterling P Smith

    General Atomics

  • David Orozco

    General Atomics

  • Venkitesh Ayyar

    SapientAI LLC

  • Mitchell Clark

    General Atomics

  • Michael Fredrickson

    Sophelio

  • David R Hatch

    University of Texas at Austin, IFS

  • Todd A. Oliver

    Oden Institute for Computational Engineering and Sciences

  • Martin Foltin

    Hewlett Packard Enterprise

  • Dongyang Kuang

    Sophelio

  • Christopher G Holland

    University of California, San Diego

  • Joseph T McClenaghan

    General Atomics

  • Joseph M Schmidt

    University of Texas at Austin

  • Bhavin S Patel

    United Kingdom Atomic Energy Authority, Culham Campus, Abingdon, OX14 3DB, UK

  • Aaron Ho

    MIT PSFC, Massachusetts Institute of Technology