One Facility's Perspective on Data for Machine Learning

ORAL · Invited

Abstract

In the era of machine learning (ML), the quality and accessibility of data play a pivotal role in the success of algorithms and models. In this talk, we will explore the role of data management and sharing plans (DMSPs) in producing AI-ready data through the lens of experiments at the Continuous Electron Beam Accelerator Facility (CEBAF).

Experimental nuclear physics facilities generate data essential for addressing major scientific questions. Applying AI/ML techniques has led to increased efficiencies in experiment and accelerator operations and to improved reconstruction algorithms.

However, research results often suffer from limited reproducibility and a lack of generalization. Overcoming these obstacles requires interoperable data, systems, and policies, built through a partnership between the experimental facility and the user community. This shared understanding can be expressed in data management and sharing plans (DMSPs).

DMSPs are vital in establishing a solid framework for data governance. They aid in organizing data lifecycles from collection to archiving, ensuring that datasets are well-documented, properly stored, and easily retrievable for future use.

Presenters

  • Amber S Boehnlein

    Jefferson Lab/Jefferson Science Associates

Authors

  • Amber S Boehnlein

    Jefferson Lab/Jefferson Science Associates