
Machine Learning Interatomic Potentials: Training Data Selection for Accuracy and Transferability

ORAL

Abstract

Machine learning (ML) techniques have enabled ML-based interatomic potentials that retain both the accuracy of first-principles methods and the linear scaling and parallel efficiency of empirical potentials. Despite these advances, ML-based potentials often struggle to achieve transferability, i.e., consistent accuracy across diverse configurations. This work demonstrates that establishing accurate yet transferable potentials requires a systematic approach to training data selection. We leverage entropy maximization in atomic descriptors within an automated sampling scheme to construct a diverse training set for tungsten, which is then used to train numerous neural network potentials and simplified SNAP potentials. When tested on both entropy-guided and physics-guided hold-out data, all of the potentials show similar and consistent performance despite their different characteristics and model forms. The predictions across this characteristic pair of training sets show that models trained on diverse configurations yield more accurate predictions than models trained on low-energy configurations selected by domain expertise.
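The automated sampling scheme above selects training configurations that spread out over atomic-descriptor space rather than clustering around low-energy structures. As a rough illustration only (not the authors' actual entropy-maximization algorithm), the following sketch uses greedy farthest-point selection over hypothetical per-configuration descriptor vectors, a simple proxy for maximizing descriptor diversity:

```python
import numpy as np

def select_diverse(descriptors, n_select, seed=0):
    """Greedy farthest-point selection over per-configuration descriptors.

    A hypothetical stand-in for entropy-guided sampling: at each step,
    pick the configuration whose descriptor vector is farthest (Euclidean
    distance) from everything already selected, spreading the training
    set across descriptor space instead of concentrating it near a few
    low-energy basins.
    """
    rng = np.random.default_rng(seed)
    n = len(descriptors)
    chosen = [int(rng.integers(n))]
    # Distance of every candidate to its nearest already-chosen point.
    d = np.linalg.norm(descriptors - descriptors[chosen[0]], axis=1)
    for _ in range(n_select - 1):
        nxt = int(np.argmax(d))  # farthest remaining configuration
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(descriptors - descriptors[nxt], axis=1))
    return chosen

# Toy usage: 200 random 8-dimensional "descriptor" vectors, keep 10.
X = np.random.default_rng(1).normal(size=(200, 8))
picked = select_diverse(X, 10)
```

In a real workflow the descriptor vectors would come from the potential's own feature set (e.g., SNAP bispectrum components), and the selection criterion would be the entropy measure described in the abstract rather than this distance heuristic.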

SNL is managed and operated by NTESS under DOE NNSA contract DE-NA0003525. SAND No. SAND2021-13306 A

Presenters

  • David O Montes de Oca Zapiain

Sandia National Laboratories

Authors

  • David O Montes de Oca Zapiain

Sandia National Laboratories

  • Mitchell A Wood

    Sandia National Laboratories

  • Danny Perez

Los Alamos National Laboratory

  • Carlos Pereyra

    University of California Davis

  • Nick Lubbers

Los Alamos National Laboratory

  • Aidan P Thompson

    Sandia National Laboratories