Interpretable Machine Learning for Materials Design

ORAL

Abstract

Fueled by the widespread adoption of Machine Learning and the high-throughput screening of materials, the data-driven approach to materials design has established itself as a robust and powerful tool for the in-silico prediction of materials properties. Researchers often face a difficult choice between a model’s interpretability and its performance. We study this trade-off using four state-of-the-art Machine Learning techniques, XGBoost [1], SISSO [2], Roost [3], and TPOT [4], to predict structural and electronic properties of perovskites and 2D materials. We identify key problems that must be addressed to continue down the path towards automation. Finally, we offer several possible solutions to these challenges, with a focus on retaining interpretability, and share our thoughts on magnifying the impact of Machine Learning on materials design.

[1] Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, August 2016. DOI: 10.1145/2939672.2939785.

[2] Runhai Ouyang, Stefano Curtarolo, Emre Ahmetcik, Matthias Scheffler, and Luca M. Ghiringhelli. SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates. Physical Review Materials, 2(8):083802, August 2018. ISSN 2475-9953. DOI: 10.1103/PhysRevMaterials.2.083802.

[3] Rhys E. A. Goodall and Alpha A. Lee. Predicting materials properties without crystal structure: Deep representation learning from stoichiometry. Nature Communications, 11(1):6280, December 2020. ISSN 2041-1723. DOI: 10.1038/s41467-020-19964-7.

[4] Randal S. Olson, Nathan Bartley, Ryan J. Urbanowicz, and Jason H. Moore. Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science. In Proceedings of the Genetic and Evolutionary Computation Conference 2016, pages 485–492, New York, NY, USA, July 2016. ISBN 978-1-4503-4206-3. DOI: 10.1145/2908812.2908918.

Presenters

  • Timur Bazhirov

    Exabyte Inc.

Authors

  • Timur Bazhirov

    Exabyte Inc.

  • James Dean

    Exabyte Inc.

  • Rahul Bhowmik

    Polaron Analytics

  • Sergey Barabash

    Intermolecular, Inc.

  • Matthias Scheffler

    The NOMAD Laboratory at the Fritz Haber Institute of the Max Planck Society

  • Thomas A Purcell

    The NOMAD Laboratory at the Fritz Haber Institute of the Max Planck Society