Ameliorating synthesis and scarce data challenges through joint embedding for high energy molecule generation
ORAL
Abstract
Deep learning has shown high potential for generating molecules with desired properties. However, generative modeling often yields novel, speculative molecules whose synthesis routes are not obvious. Moreover, the cost and time required to calculate or measure the properties of high-energy molecules have restricted the size of the available data sets for this class of materials, limiting the usefulness of deep learning-based methods. To address both problems, we propose a deep learning-based method that fuses data from multiple molecule classes, enabling high-energy molecules to be learned and designed with the assistance of data for general organic molecules, which tend to be available in massive databases. Low-level physicochemical information from both classes of molecules is embedded into the common latent space of an autoencoder using the joint embedding method. This enriches the chemical information contained in the high-energy set with information from relatively low-energy molecules. In particular, the joint embedding approach yields a differentiable latent representation, which allows the chemical space around some of the high-energy molecules to be explored. Through local gradient ascent optimization, we generate molecules that are similar to known high-energy molecules but have better material properties. The similarity afforded by local optimization is expected to reduce the difficulty of synthesis planning. The target properties of the generated molecules are validated using an equilibrium thermochemistry code.
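The local gradient ascent step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' model: the property predictor is a hypothetical smooth surrogate (`predicted_property`), the latent dimension, step size, and search radius are arbitrary, and the decoding of the optimized latent point back to a molecule is omitted. The sketch only shows the idea of ascending a differentiable property estimate while staying close to the latent code of a known seed molecule.

```python
import numpy as np


def predicted_property(z, w, b):
    """Hypothetical smooth surrogate for a property predictor on a latent vector z."""
    return float(np.tanh(z @ w + b))


def property_gradient(z, w, b):
    """Analytic gradient of tanh(z.w + b) with respect to z."""
    return (1.0 - np.tanh(z @ w + b) ** 2) * w


def local_gradient_ascent(z0, w, b, step=0.05, n_steps=50, radius=0.5):
    """Ascend the surrogate property while staying within `radius` of the seed z0."""
    z = z0.copy()
    for _ in range(n_steps):
        # Move uphill on the surrogate property.
        z = z + step * property_gradient(z, w, b)
        # Project back onto the ball around the seed so the result stays "similar".
        offset = z - z0
        dist = np.linalg.norm(offset)
        if dist > radius:
            z = z0 + offset * (radius / dist)
    return z


# Illustrative values only: an 8-dimensional latent space and random surrogate weights.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.0
z_seed = rng.normal(size=8) * 0.1  # stands in for the latent code of a known molecule
z_opt = local_gradient_ascent(z_seed, w, b)

print("seed property:     ", predicted_property(z_seed, w, b))
print("optimized property:", predicted_property(z_opt, w, b))
print("latent distance:   ", np.linalg.norm(z_opt - z_seed))
```

The radius constraint is one simple way to keep the search local; a similarity penalty in the objective would be another option under the same assumptions.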
Publication: Balakrishnan, Sangeeth, Francis G. VanGessel, Zois Boukouvalas, Brian C. Barnes, Mark D. Fuge, and Peter W. Chung. "Locally Optimizable Joint Embedding Framework to Design Nitrogen-rich Molecules that are Similar but Improved." Molecular Informatics 40, no. 7 (2021): 2100011.
Presenters
Sangeeth Balakrishnan
Department of Mechanical Engineering, University of Maryland, College Park
Authors
Sangeeth Balakrishnan
Department of Mechanical Engineering, University of Maryland, College Park
Francis G. VanGessel
U.S. Naval Surface Warfare Center, Indian Head Division, Indian Head, MD
Zois Boukouvalas
Department of Mathematics and Statistics, American University, Washington, DC
Brian C. Barnes
U.S. Army Combat Capabilities Development Command (DEVCOM) Army Research Laboratory, Aberdeen Proving Ground, MD
Mark D. Fuge
Department of Mechanical Engineering, University of Maryland, College Park
Peter W. Chung
Department of Mechanical Engineering, University of Maryland, College Park