Utilization of the MongoDB Repository for Information and Archiving (MORIA) framework to manage large data sets for machine-learning applications
POSTER
Abstract
As the high-energy-density (HED) physics community moves towards high-repetition-rate (HRR) operation in the ~0.01-10 Hz regime, a new paradigm of data management must be adopted [Feister, HPLSE 11 (2023)]. Data acquisition from experimental subsystems, including the laser, targetry, and performance diagnostics, must be synchronized and archived in real time (~10-100 MB/s, ~1-10 PB/yr). The database architecture must be flexible and expandable to suit the application or experimental campaign, guided by the FAIR (Findable, Accessible, Interoperable, and Reusable) Guiding Principles [Wilkinson, Scientific Data 3 (2016)]. To this end, General Atomics (GA) has begun development of a NoSQL-database framework, the MongoDB Repository for Information and Archiving (MORIA). An organizational schema has been implemented that shifts scientific HED data organization from a shot-based to a diagnostic-based approach, increasing archival and retrieval efficiency and better supporting optimization applications. MORIA has been installed at the GALADRIEL facility and has demonstrated 1 Hz archival of multiple laser and target diagnostics for thousands of shots. An overview of the database implementation will be given, and results from a recently developed machine-learning model for inferring and controlling the compressed pulse shape will be shown.
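To make the shot-based versus diagnostic-based distinction concrete, a minimal sketch of what a diagnostic-oriented document layout could look like is given below using pymongo. The host URI, collection name, and field names (e.g., "laser_energy", "shot_id") are illustrative assumptions and are not the actual MORIA schema.

```python
# Minimal sketch of a diagnostic-based MongoDB layout (assumed names, not MORIA's schema).
from datetime import datetime, timezone

from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical archive host
db = client["moria"]

# One collection per diagnostic, rather than one document per shot.
laser_energy = db["laser_energy"]
laser_energy.create_index([("shot_id", ASCENDING), ("timestamp", ASCENDING)])

# Each acquisition is archived as it arrives from the diagnostic.
laser_energy.insert_one(
    {
        "shot_id": 12345,                         # links the record back to its shot
        "timestamp": datetime.now(timezone.utc),  # acquisition time
        "campaign": "example-campaign",           # experimental campaign tag
        "energy_J": 0.85,                         # example measured quantity
    }
)

# Retrieval by diagnostic is then a single indexed query, e.g. all
# laser-energy records over a range of shots.
records = laser_energy.find({"shot_id": {"$gte": 12000, "$lte": 13000}})
```

In a layout like this, adding a new diagnostic only requires creating a new collection with its own document fields, which is one way the flexibility called for by the FAIR principles could be realized.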
Presenters
-
Mario J Manuel
General Atomics - San Diego
Authors
-
Mario J Manuel
General Atomics - San Diego
-
Javier H Nicolau
University of California, San Diego, San Diego Supercomputer Center
-
Austin Keller
General Atomics
-
Gilbert W Collins
General Atomics - San Diego
-
Sean M Buczek
General Atomics; UC San Diego
-
Brian Sammuli
General Atomics
-
Raffi M Nazikian
General Atomics
-
Neil B Alexander
General Atomics - San Diego