DFT-45B---a fertile soil (data) for your seeds (machine learning algorithms)
ORAL
Abstract
Machine-learning (ML) models have become the new paradigm in computational materials science for predicting properties of materials with the accuracy of quantum mechanics at a fraction of the cost. Accurate data (fertile soil) is crucial: it lets us build better ML models (healthier plants) with any ML algorithm (seed). Inconsistencies in the data extracted from existing materials repositories---fewer than a few hundred calculations per alloy system, varied prototype sizes, and k-point densities that vary with cell size---make it challenging to develop effective ML models. We created a DFT-based materials dataset (DFT-45B) of 45 binary alloy systems (all binary combinations of 10 elements---Ag, Al, Co, Cu, Fe, Mg, Nb, Ni, Ti, and V) with more than 71,775 calculations free of such inconsistencies. Each alloy system includes all enumerated crystal structures with up to 8 atoms for the fcc, bcc, and hcp crystal types. Because the data spans 10 elements and all their binary combinations, it also helps in understanding the similarity between elements and between alloys. In this talk, we present the methodology and heuristics behind the dataset.
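As a quick check on the system count stated in the abstract, the minimal Python sketch below enumerates every unordered pair of the 10 listed elements. The function name `binary_systems` and the `A-B` label format are illustrative assumptions, not part of the dataset's own tooling.

```python
from itertools import combinations

# The 10 elements named in the abstract.
ELEMENTS = ["Ag", "Al", "Co", "Cu", "Fe", "Mg", "Nb", "Ni", "Ti", "V"]


def binary_systems(elements):
    """Return every unordered pair of distinct elements as a label such as 'Ag-Al'.

    The label format is a hypothetical convention chosen for this sketch.
    """
    return ["-".join(pair) for pair in combinations(sorted(elements), 2)]


systems = binary_systems(ELEMENTS)
print(len(systems))   # number of binary systems, C(10, 2)
print(systems[:3])    # first few labels in alphabetical order
```

Choosing 2 of 10 elements gives 10 × 9 / 2 = 45 pairs, which matches the 45 binary alloys in the dataset name.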
–
Presenters
-
Chandramouli Nyshadham
Kebotix, Inc., Cambridge, MA 02139, USA; Brigham Young Univ - Provo
Authors
-
Chandramouli Nyshadham
Kebotix, Inc., Cambridge, MA 02139, USA; Brigham Young Univ - Provo
-
Christoph Kreisbeck
Kebotix, Inc., Cambridge, MA 02139, USA
-
Gus Hart
Department of Physics and Astronomy, Brigham Young Univ - Provo