Near real-time streaming analysis of big fusion data

POSTER

Abstract

Fusion plasma diagnostics, such as electron-cyclotron emission imaging (ECEI) diagnostics, routinely generate fast, high-dimensional data streams, typically on the order of gigabytes per second. Future devices, like ITER, are predicted to generate multiple petabytes of measurement data per day. Such large datasets cannot be analyzed manually. Furthermore, the parties interested in the analysis results are scattered around the globe. To address these issues, we are developing Delta (aDaptive nEar-reaL Time Analysis framework), a Python framework that streams measurement data to a remote compute center, performs data analysis using distributed compute resources, and displays visualizations of the analyzed data on a web-based dashboard. In this contribution we demonstrate a use case in which we stream ECEI measurements taken at the KSTAR tokamak in Korea to the NERSC compute center in California. Using Delta, we achieve a bandwidth of over 500 MB/s and perform a turbulence analysis of the entire dataset in under 5 minutes. The analyzed data can be presented in near real-time on a web-based dashboard. Finally, we discuss how machine learning-based classifiers can be used in Delta to automatically target data analysis routines to relevant subsets of the data stream.
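
The stream-then-analyze pattern described in the abstract can be illustrated with a minimal, single-machine sketch. This is not Delta's code or API: every name here (`generate_chunks`, `stream_chunks`, `analyze_chunk`, `run_pipeline`) is hypothetical, the in-memory queue stands in for the wide-area transport, the thread pool stands in for distributed compute resources, and the per-chunk variance is only a placeholder for a real turbulence analysis.

```python
import queue
import statistics
import threading
from concurrent.futures import ThreadPoolExecutor

SENTINEL = None  # marks the end of the stream


def generate_chunks(num_chunks, chunk_size):
    """Yield (index, samples) pairs standing in for diagnostic time chunks."""
    for idx in range(num_chunks):
        # Deterministic synthetic signal; a real source would read detector data.
        yield idx, [float((idx + 1) * (n % 4)) for n in range(chunk_size)]


def stream_chunks(chunks, channel):
    """Producer: push chunks into the channel, then signal end of stream."""
    for item in chunks:
        channel.put(item)
    channel.put(SENTINEL)


def analyze_chunk(item):
    """Placeholder analysis: population variance of one chunk."""
    idx, samples = item
    return idx, statistics.pvariance(samples)


def run_pipeline(num_chunks=8, chunk_size=64, workers=4):
    """Stream chunks through a bounded queue and analyze them in a worker pool."""
    # Bounded queue: the producer blocks when the consumer falls behind.
    channel = queue.Queue(maxsize=4)
    producer = threading.Thread(
        target=stream_chunks,
        args=(generate_chunks(num_chunks, chunk_size), channel),
    )
    producer.start()

    futures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            item = channel.get()
            if item is SENTINEL:
                break
            # Hand each chunk to a worker as soon as it arrives.
            futures.append(pool.submit(analyze_chunk, item))
    producer.join()

    # Results keyed by chunk index, ready to hand to a plotting front end.
    return dict(f.result() for f in futures)


if __name__ == "__main__":
    for idx, value in sorted(run_pipeline().items()):
        print(f"chunk {idx}: variance = {value:.3f}")
```

In this sketch, analysis of early chunks overlaps with the arrival of later ones, which is the property that makes near real-time presentation possible; a real deployment would replace the queue with a network transport and the thread pool with processes on a compute cluster.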

Presenters

  • Ralph Kube

    Princeton Plasma Physics Laboratory, PPPL

Authors

  • Ralph Kube

    Princeton Plasma Physics Laboratory, PPPL

  • Michael Churchill

    Princeton Plasma Physics Laboratory

  • Jong Choi

    Oak Ridge National Laboratory

  • Jason Wang

    Oak Ridge National Laboratory

  • Laurie Stephey

    Lawrence Berkeley National Laboratory

  • Choongseok Chang

    Princeton Plasma Physics Laboratory, Princeton University

  • Scott Klasky

    Oak Ridge National Laboratory