Machine Learning Interatomic Potentials: Training Data Selection for Accuracy and Transferability
ORAL
Abstract
Machine learning (ML) techniques have enabled ML-based interatomic potentials that retain the accuracy of first-principles methods while preserving the linear scaling and parallel efficiency of empirical potentials. Despite these advances, ML-based potentials often struggle to achieve transferability, i.e., consistent accuracy across diverse configurations. This work demonstrates that establishing accurate yet transferable potentials requires a systematic approach to training data selection. We leverage entropy maximization in atomic descriptor space within an automated sampling scheme to construct a diverse training set for tungsten, which is then used to train several neural network potentials and simplified SNAP potentials. When tested on both entropy-guided and physics-guided hold-out data, all of the potentials show similar, consistent performance despite their different characteristics and model forms. Predictions across this pair of training sets show that models trained on diverse configurations yield more accurate predictions than models trained on low-energy configurations selected by domain expertise.
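The sketch below illustrates the kind of descriptor-space diversity selection the abstract describes. It is a minimal, illustrative example only: it assumes one descriptor vector per candidate configuration (e.g., averaged bispectrum components) and uses greedy farthest-point selection as a simple proxy for entropy maximization; the function name and selection criterion are assumptions, not the authors' exact automated sampling scheme.

```python
import numpy as np

def select_diverse_configs(descriptors, n_select):
    """Greedy farthest-point selection in descriptor space.

    descriptors: (n_configs, n_features) array, one descriptor vector per
                 candidate configuration (illustrative stand-in for per-atom
                 descriptors averaged over each configuration).
    n_select:    number of training configurations to keep.

    Maximizing pairwise spread in descriptor space is used here only as a
    rough proxy for entropy maximization.
    """
    # Start from the configuration farthest from the descriptor mean.
    start = int(np.argmax(np.linalg.norm(
        descriptors - descriptors.mean(axis=0), axis=1)))
    selected = [start]
    # Distance of every candidate to its closest already-selected point.
    min_dist = np.linalg.norm(descriptors - descriptors[start], axis=1)
    for _ in range(n_select - 1):
        nxt = int(np.argmax(min_dist))        # most "novel" remaining config
        selected.append(nxt)
        new_dist = np.linalg.norm(descriptors - descriptors[nxt], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return np.array(selected)

# Example: pick 100 diverse configurations from 10,000 random candidates
# with 55 descriptor components each (hypothetical numbers).
rng = np.random.default_rng(0)
candidates = rng.normal(size=(10_000, 55))
train_idx = select_diverse_configs(candidates, 100)
```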
SNL is managed and operated by NTESS under DOE NNSA contract DE-NA0003525. SAND No. SAND2021-13306 A.
Presenters
- David O Montes de Oca Zapiain, Sandia National Laboratories
Authors
- David O Montes de Oca Zapiain, Sandia National Laboratories
- Mitchell A Wood, Sandia National Laboratories
- Danny Perez, Los Alamos National Laboratory
- Carlos Pereyra, University of California, Davis
- Nick Lubbers, Los Alamos National Laboratory
- Aidan P Thompson, Sandia National Laboratories