Optimizing Feature Space for Small or Lower-Quality Data: A Case-Study in Charge Carrier Mobility

ORAL

Abstract

Artificial intelligence (AI) creates models that can accelerate the discovery of functional materials [1]. An open question is how to select the relevant materials features (descriptive parameters that characterize the material) that should be used to represent the material's function of interest, especially when good-quality data are scarce. Here we present a method that uses feature importance metrics, such as SHAP values [2, 3], to select an optimal set of input features for a given problem. We then use this procedure to train better models for electron mobility, using a dataset of 64 materials with experimentally determined electron mobilities and 23 computationally generated inputs. From this, we find a subset of four features that generates the best models across multiple regression techniques. The final set of models is then analyzed to find the regions of material space where high electron mobilities are expected.
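As an illustration only, the general idea of ranking features by mean absolute SHAP value and keeping the top few can be sketched as follows. This is not the authors' procedure: the data are synthetic (random numbers shaped like the 64-material, 23-feature dataset), the linear least-squares model is a stand-in for the regression techniques mentioned above, and the SHAP values use the closed form for a linear model under an assumed feature independence. The names `linear_shap_values` and `select_top_features`, and the choice of which four synthetic features are informative, are hypothetical.

```python
import numpy as np


def linear_shap_values(X, weights):
    """SHAP values of a linear model, assuming independent features.

    For f(x) = b + w . x, feature j's contribution to sample i is
    w_j * (x_ij - mean_j).
    """
    return (X - X.mean(axis=0)) * weights


def select_top_features(X, y, n_keep):
    """Fit least squares, rank features by mean |SHAP|, keep the top n_keep."""
    # Append a column of ones so the fit includes an intercept.
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    weights, intercept = coef[:-1], coef[-1]

    phi = linear_shap_values(X, weights)
    importance = np.abs(phi).mean(axis=0)      # global importance per feature
    ranked = np.argsort(importance)[::-1]      # most important first
    return sorted(ranked[:n_keep].tolist()), phi, weights, intercept


# Synthetic stand-in data (NOT the mobility dataset): 64 samples, 23 features,
# of which four arbitrarily chosen columns actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 23))
true_weights = np.zeros(23)
true_weights[[2, 7, 11, 19]] = [3.0, -2.0, 1.5, 2.5]
y = X @ true_weights + 0.1 * rng.normal(size=64)

selected, phi, weights, intercept = select_top_features(X, y, n_keep=4)

# SHAP additivity check: base value plus contributions gives the prediction.
predictions = intercept + X @ weights
base_value = predictions.mean()

print("selected feature indices:", selected)
```

In practice the reduced feature set would be used to retrain each regression model, and the selection could be repeated across models or cross-validation splits to check that the same features keep appearing.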

[1] S. Bauer et al., Modelling Simul. Mater. Sci. Eng. 32, 063301 (2024)

[2] K. Aas, M. Jullum, and A. Løland, Artif. Intell. 298, 103502 (2021)

[3] T. A. R. Purcell, M. Scheffler, L. M. Ghiringhelli, and C. Carbogno, npj Comput. Mater. 9, 112 (2023)

Presenters

  • Thomas A R Purcell

    University of Arizona

Authors

  • Thomas A R Purcell

    University of Arizona

  • Yi Yao

    The NOMAD Laboratory at the FHI of the MPS and MS1P e.V. Berlin

  • Raushan Anjum

    University of Arizona

  • Matthias Scheffler

    The NOMAD Laboratory at FHI, Max Planck Society