Hypothesis-driven active learning over the chemical space
ORAL
Abstract
From identifying potential drug targets to designing electronics, catalysts, photovoltaics and chemical reactions, efforts to discover molecular candidates have risen steeply over the years. Rapid exploration of chemical space for desired functionalities is typically carried out by high-throughput screening combined with computational simulations and synthesis. Here, we introduce a novel approach to active learning over a wide chemical space based on hypothesis learning. The study is conducted on the ~130,000 molecules in the QM9 dataset, with the goal of actively learning the formation enthalpy of all molecules. We construct multiple hypotheses based on possible relationships between structures and the functionalities of interest, and introduce these as mean functions of a Gaussian process. The approach thereby combines elements of symbolic regression methods such as SISSO with Bayesian optimization in a single framework. Although demonstrated on the QM9 dataset, the method is expected to be universally applicable to other datasets, covering systems from molecules to solid-state materials.
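The idea of using competing hypotheses as Gaussian process mean functions inside an active learning loop can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a synthetic one-dimensional target instead of QM9 descriptors, and the hypothesis set, the `fit_predict` and `active_learning` helpers, the kernel settings, and the selection rule (highest log marginal likelihood of the residuals, with no complexity penalty) are all assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, length=0.3, var=1.0):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

# Hypothetical candidate hypotheses: each maps x to a design matrix whose
# columns are the terms of a parametric mean function.
HYPOTHESES = {
    "linear": lambda x: np.stack([np.ones_like(x), x], axis=1),
    "quadratic": lambda x: np.stack([np.ones_like(x), x, x ** 2], axis=1),
}

def fit_predict(x_tr, y_tr, x_te, basis, noise=1e-2):
    """GP regression with a parametric mean (assumed fitting scheme)."""
    # Fit the mean-function coefficients by ordinary least squares.
    coef, *_ = np.linalg.lstsq(basis(x_tr), y_tr, rcond=None)
    resid = y_tr - basis(x_tr) @ coef
    # Zero-mean GP on the residuals.
    K = rbf(x_tr, x_tr) + noise * np.eye(len(x_tr))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, resid))
    Ks = rbf(x_te, x_tr)
    mean = basis(x_te) @ coef + Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf(x_te, x_te).diagonal() - np.sum(v ** 2, axis=0)
    # Log marginal likelihood of the residuals, used to rank hypotheses.
    lml = (-0.5 * resid @ alpha - np.log(np.diag(L)).sum()
           - 0.5 * len(x_tr) * np.log(2 * np.pi))
    return mean, np.maximum(var, 0.0), lml

def best_model(x_pool, y_pool, idx):
    """Fit every hypothesis on the measured points and keep the best one."""
    fits = {name: fit_predict(x_pool[idx], y_pool[idx], x_pool, basis)
            for name, basis in HYPOTHESES.items()}
    name = max(fits, key=lambda k: fits[k][2])
    return name, fits[name][0], fits[name][1]

def active_learning(x_pool, y_pool, n_init=5, n_steps=15, seed=0):
    rng = np.random.default_rng(seed)
    idx = [int(i) for i in rng.choice(len(x_pool), n_init, replace=False)]
    for _ in range(n_steps):
        _, _, var = best_model(x_pool, y_pool, idx)
        var = var.copy()
        var[idx] = -np.inf              # never re-query a measured point
        idx.append(int(np.argmax(var)))  # query the most uncertain point
    name, mean, _ = best_model(x_pool, y_pool, idx)
    return idx, name, mean

# Synthetic stand-in for a structure-property relationship.
rng = np.random.default_rng(1)
x_pool = np.linspace(-1.0, 1.0, 200)
y_pool = 1.0 + 0.5 * x_pool + 2.0 * x_pool ** 2 + 0.05 * rng.normal(size=200)

chosen, best_name, mean = active_learning(x_pool, y_pool)
rmse = float(np.sqrt(np.mean((mean - y_pool) ** 2)))
print(f"queried {len(chosen)} points, selected hypothesis: {best_name}, RMSE: {rmse:.3f}")
```

In this sketch the acquisition is pure uncertainty sampling; a Bayesian optimization variant would replace the `argmax` over the variance with an acquisition function that also uses the predicted mean. Ranking hypotheses by residual likelihood alone tends to favor the more flexible mean function, so a practical scheme would likely need some form of complexity control.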
–
Presenters
-
Ayana Ghosh
Oak Ridge National Lab
Authors
-
Ayana Ghosh
Oak Ridge National Lab
-
Sergei V Kalinin
University of Tennessee, Knoxville
-
Maxim Ziatdinov
Oak Ridge National Lab