Uncertainty Quantification of SISSO-based Symbolic Regression and Application to the Discovery of Oxides for Water Splitting
ORAL
Abstract
Symbolic regression (SR) requires comparatively little data and avoids the need for a pre-assumed representation, which makes it a suitable method for explainable machine learning in materials science. To integrate SR models effectively into materials discovery workflows, it is essential to quantify the reliability of their predictions and to explore promising regions of the materials space systematically. In this work, we present an ensemble framework to estimate the uncertainties of SR models based on the sure-independence screening and sparsifying operator (SISSO) approach. SISSO generates analytical expressions (descriptors) for target properties using moderately sized datasets. We apply uncertainty quantification (UQ) methods such as resampling, feature bagging, and varying model complexity, and assess their performance in terms of prediction errors and miscalibration scores, particularly when new data follow a different property distribution. By leveraging the best UQ strategy and SISSO-derived descriptors, we can reduce the risk of overlooking potentially interesting portions of the materials space that were disregarded in the initial training data. The efficacy of this SISSO-guided workflow is demonstrated by identifying acid-stable oxides for the water-splitting reaction through DFT-HSE06 calculations.
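The ensemble idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not call SISSO. As a stand-in, each ensemble member is a one-feature linear fit chosen from a random feature subset (feature bagging) on a bootstrap resample (resampling). The function names `fit_member`, `ensemble_predict`, and `miscalibration` are hypothetical. The miscalibration score used here is one common choice, assumed because the abstract does not define it: the mean absolute gap between the expected and observed coverage of Gaussian prediction intervals. The data are synthetic.

```python
import random
from statistics import NormalDist, mean, pstdev


def fit_member(X, y, rng, n_feat):
    """One ensemble member: bootstrap the rows, bag the features, and keep
    the best one-feature linear fit (a simple stand-in for SISSO)."""
    n = len(y)
    rows = [rng.randrange(n) for _ in range(n)]    # resampling with replacement
    feats = rng.sample(range(len(X[0])), n_feat)   # feature bagging
    yb = [y[i] for i in rows]
    y_mean = mean(yb)
    best = None
    for j in feats:
        xb = [X[i][j] for i in rows]
        x_mean = mean(xb)
        sxx = sum((x - x_mean) ** 2 for x in xb)
        if sxx == 0.0:
            continue  # constant feature in this resample, cannot fit a slope
        slope = sum((x - x_mean) * (t - y_mean) for x, t in zip(xb, yb)) / sxx
        intercept = y_mean - slope * x_mean
        sse = sum((intercept + slope * x - t) ** 2 for x, t in zip(xb, yb))
        if best is None or sse < best[0]:
            best = (sse, j, slope, intercept)
    if best is None:
        return (feats[0], 0.0, y_mean)  # fall back to a constant model
    return best[1:]


def ensemble_predict(models, x):
    """Ensemble mean as the prediction, ensemble spread as the uncertainty."""
    preds = [intercept + slope * x[j] for j, slope, intercept in models]
    return mean(preds), pstdev(preds)


def miscalibration(y_true, mu, sigma, levels=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    """Mean absolute gap between expected and observed coverage of
    Gaussian intervals (assumed definition of the score)."""
    nd = NormalDist()
    gaps = []
    for p in levels:
        z = nd.inv_cdf(0.5 + p / 2)  # half-width of the central p-interval
        hits = [1.0 if abs(t - m) <= z * s else 0.0
                for t, m, s in zip(y_true, mu, sigma)]
        gaps.append(abs(mean(hits) - p))
    return mean(gaps)


# Synthetic example: 4 candidate features, target depends on the first two.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(60)]
y = [2 * x[0] - x[1] + rng.gauss(0, 0.1) for x in X]
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]

models = [fit_member(X_train, y_train, rng, n_feat=2) for _ in range(50)]
results = [ensemble_predict(models, x) for x in X_test]
mu = [m for m, _ in results]
sigma = [s for _, s in results]
score = miscalibration(y_test, mu, sigma)
print(f"miscalibration score: {score:.3f}")
```

A score near zero would indicate that the ensemble spread matches the observed errors. Varying model complexity, the third strategy named in the abstract, is not covered by this sketch.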
Presenters
-
AKHIL S NAIR
The NOMAD Laboratory at FHI, Max Planck Society
Authors
-
AKHIL S NAIR
The NOMAD Laboratory at FHI, Max Planck Society
-
Lucas Foppa
Fritz Haber Institute of the Max Planck Society, The NOMAD Laboratory at FHI, Max Planck Society
-
Matthias Scheffler
The NOMAD Laboratory at FHI, Max Planck Society