Optimizing Feature Space for Small or Lower-Quality Data: A Case-Study in Charge Carrier Mobility
ORAL
Abstract
Artificial intelligence (AI) creates models that can accelerate the discovery of functional materials [1]. An open question is how to select the relevant materials features (descriptive parameters that characterize the material) that should be used to represent the material's function of interest, especially when there is a paucity of good-quality data. Here we present a method that uses feature importance metrics, such as SHAP values [2, 3], to select an optimal set of input features for a given problem. We then use this procedure to train better models for electron mobility, using a dataset of 64 materials with experimentally determined electron mobilities and 23 computationally generated inputs. From these inputs we find a subset of four features that generates the best model across multiple regression techniques. The final set of models is then analyzed to find the regions of material space where high electron mobilities are expected.
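As a rough illustration of importance-driven feature selection, the sketch below repeatedly drops the least important input until a target number of features remains. It is not the procedure used in this work, which the abstract does not specify in detail. It rests on several assumptions: the data are synthetic (64 samples, 6 inputs, not the mobility dataset), the model is ordinary least squares, and importance is an exact Shapley decomposition of the model's R^2, used here as a simple stand-in for SHAP values. The function names `r2_of_subset`, `shapley_importance`, and `select_features` are hypothetical.

```python
import itertools
import math

import numpy as np


def r2_of_subset(X, y, cols):
    """R^2 of a least-squares fit (with intercept) using only the columns in `cols`."""
    if not cols:
        return 0.0
    A = np.column_stack([X[:, list(cols)], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def shapley_importance(X, y):
    """Exact Shapley share of R^2 per feature (only feasible for a few features)."""
    n = X.shape[1]
    phi = np.zeros(n)
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for S in itertools.combinations(others, size):
                gain = r2_of_subset(X, y, S + (j,)) - r2_of_subset(X, y, S)
                phi[j] += weight * gain
    return phi


def select_features(X, y, n_keep):
    """Backward elimination: drop the lowest-importance feature until n_keep remain."""
    kept = list(range(X.shape[1]))
    while len(kept) > n_keep:
        phi = shapley_importance(X[:, kept], y)
        kept.pop(int(np.argmin(phi)))
    return kept


# Synthetic example (assumption): only columns 0 and 3 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=64)

selected = select_features(X, y, n_keep=2)
print("selected feature indices:", selected)
```

The exact Shapley sum scales exponentially with the number of inputs, so for something like 23 features one would instead rely on an approximate estimator, and would typically judge each candidate subset by cross-validated error rather than by in-sample R^2.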
[1] S. Bauer et al., Modelling Simul. Mater. Sci. Eng. 32, 063301 (2024)
[2] K. Aas, M. Jullum, and A. Løland, Artif. Intell. 298, 103502 (2021)
[3] T. A. R. Purcell, M. Scheffler, L. M. Ghiringhelli, and C. Carbogno, npj Comput. Mater. 9, 112 (2023)
–
Presenters
-
Thomas A R Purcell
University of Arizona
Authors
-
Thomas A R Purcell
University of Arizona
-
Yi Yao
The NOMAD Laboratory at the FHI of the MPS and MS1P e.V. Berlin
-
Raushan Anjum
University of Arizona
-
Matthias Scheffler
The NOMAD Laboratory at FHI, Max Planck Society