Uncertainty Quantification of SISSO-based Symbolic Regression and Application to the Discovery of Oxides for Water Splitting

ORAL

Abstract

Symbolic regression (SR) requires less data and avoids the need for a pre-assumed representation, which makes it a suitable method for explainable machine learning in materials science. To effectively integrate SR models into materials discovery workflows, it is essential to quantify the reliability of their predictions and to systematically explore promising regions of the materials space. In this work, we present an ensemble framework to estimate the uncertainties of SR models based on the sure-independence screening and sparsifying operator (SISSO) approach. SISSO generates analytical expressions (descriptors) for target properties using moderately sized datasets. We apply uncertainty quantification (UQ) methods such as resampling, feature bagging, and varying model complexity, and assess their performance in terms of prediction errors and miscalibration scores, particularly when new data has a different property distribution. By leveraging the best UQ strategy and SISSO-derived descriptors, we can reduce the risk of overlooking potentially interesting portions of the materials space that were disregarded in the initial training data. The efficacy of this SISSO-guided workflow is demonstrated by identifying acid-stable oxides for the water-splitting reaction through DFT-HSE06 calculations.
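
The resampling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: an ordinary least-squares linear fit stands in for a SISSO model, the function name `bootstrap_ensemble_predict` and its parameters are hypothetical, and the data are synthetic. Each ensemble member is trained on a bootstrap resample of the training set, and the spread of the members' predictions serves as the uncertainty estimate.

```python
import numpy as np


def bootstrap_ensemble_predict(X_train, y_train, X_new, n_models=50, seed=0):
    """Return ensemble mean and standard deviation of predictions at X_new.

    Assumption: a linear least-squares model is used as a placeholder
    for the SISSO model described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_samples = X_train.shape[0]
    # Append a column of ones so the fit includes an intercept.
    A_new = np.column_stack([X_new, np.ones(X_new.shape[0])])
    preds = np.empty((n_models, X_new.shape[0]))
    for m in range(n_models):
        # Bootstrap resample: draw n_samples indices with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        A = np.column_stack([X_train[idx], np.ones(n_samples)])
        coef, *_ = np.linalg.lstsq(A, y_train[idx], rcond=None)
        preds[m] = A_new @ coef
    # Spread across ensemble members is the uncertainty estimate.
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)


# Synthetic example data (hypothetical, for illustration only).
rng = np.random.default_rng(1)
X_train = rng.normal(size=(40, 2))
y_train = 2.0 * X_train[:, 0] - X_train[:, 1] + 0.5 + rng.normal(scale=0.1, size=40)
X_new = rng.normal(size=(5, 2))

mean, std = bootstrap_ensemble_predict(X_train, y_train, X_new)
```

Feature bagging and varying model complexity, the other strategies named above, would follow the same pattern, with ensemble members differing in the input features or the descriptor complexity instead of the training samples.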

Presenters

  • AKHIL S NAIR

    The NOMAD Laboratory at FHI, Max Planck Society

Authors

  • AKHIL S NAIR

    The NOMAD Laboratory at FHI, Max Planck Society

  • Lucas Foppa

    Fritz Haber Institute of the Max Planck Society, The NOMAD Laboratory at FHI, Max Planck Society

  • Matthias Scheffler

    The NOMAD Laboratory at FHI, Max Planck Society