
Optimization of Molecular Characteristic using Continuous Representation of Molecules by Variational Autoencoder with Discriminator

ORAL

Abstract

Efficient molecular search can substantially speed up the development of organic devices and, in turn, improve their characteristics. In the present study, we focus on the variational autoencoder (VAE), a deep learning model [1] in which molecules represented as SMILES strings are efficiently converted into a multivariable continuous space. The VAE consists of two neural networks: an encoder and a decoder. The one-hot representation of a SMILES string is input to the encoder and mapped to the latent variable space, and the decoder maps points in that space back to SMILES strings. Here we further improve the rate at which the decoder outputs valid SMILES strings by introducing a discriminator attached to the VAE stream. By adopting a molecular-mechanics method to calculate the 3D structure from a SMILES string, we can optimize physical properties of the molecule with other simulation methods, such as density-functional-theory calculations, even when the available data set is insufficient. The range of physical property space covered by the SMILES representation is thereby expanded, and data-driven optimization using the kernel ridge regression method can be performed within the search space. In the presentation, we show the effectiveness of this method for optimizing a molecular HOMO-LUMO gap as an example.
[1] R. Gómez-Bombarelli et al., ACS Cent. Sci. 4, 268 (2018).
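
As an illustration of the encoder input described in the abstract, the following is a minimal sketch of a one-hot representation of a SMILES string. The character set, the fixed length of 16, the padding character, and the function names are all assumptions made for this example and are not taken from the presented work; a practical vocabulary would be built from the training data and would treat multi-character atoms such as Cl and Br as single tokens.

```python
# Hypothetical toy alphabet (assumption). The first entry is the padding character.
CHARSET = [" ", "C", "N", "O", "c", "n", "o", "(", ")", "=", "#", "1", "2"]
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHARSET)}
MAX_LEN = 16  # assumed fixed input length for the encoder


def one_hot_smiles(smiles, max_len=MAX_LEN):
    """Return a max_len x len(CHARSET) matrix (list of rows) with one 1 per row."""
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    unknown = [ch for ch in smiles if ch not in CHAR_TO_INDEX]
    if unknown:
        raise ValueError("characters not in CHARSET: %r" % unknown)
    padded = smiles.ljust(max_len)  # pad on the right with spaces
    matrix = []
    for ch in padded:
        row = [0] * len(CHARSET)
        row[CHAR_TO_INDEX[ch]] = 1
        matrix.append(row)
    return matrix


def decode_one_hot(matrix):
    """Invert one_hot_smiles: pick the active character in each row, drop padding."""
    chars = [CHARSET[row.index(1)] for row in matrix]
    return "".join(chars).rstrip()


benzene = one_hot_smiles("c1ccccc1")
print(len(benzene), len(benzene[0]))  # rows = positions, columns = characters
print(decode_one_hot(benzene))
```

Each row of the matrix corresponds to one position in the string and each column to one character of the alphabet, so a network can consume the molecule as a fixed-size array. A string is syntactically valid SMILES only if it can be parsed back into a molecule, which this sketch does not check.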

Presenters

  • Kyosuke Sato

    Graduate School of Natural Science and Technology, Okayama University

Authors

  • Kyosuke Sato

    Graduate School of Natural Science and Technology, Okayama University

  • Kenji Tsuruta

    Graduate School of Natural Science and Technology, Okayama University