A Machine Learned Model for Solid Form Volume Estimation Based on Packing-Accessible Surface and Molecular Topological Fragments

ORAL

Abstract

We present a machine-learned model for predicting the volume of a homomolecular crystal from the structure of a single molecule. The model is based on two descriptors: the volume enclosed by the packing-accessible surface and molecular topological fragments. To calculate the volume enclosed by the molecular surface, we have developed a new "projected marching cubes" algorithm. The packing-accessible surface is then calculated using an optimized probe radius. The molecular topological fragments are used to construct a representation that captures the bonding environments of the atoms in the molecule, and feature selection determines which fragments to include in the model. The accuracy and robustness of the model may be attributed to the inclusion of both geometric and chemical features. The volume enclosed by the packing-accessible surface accounts for the presence of voids and sterically hindered regions, as well as for the effect of conformational changes. The molecular topological fragments account for the effect of intermolecular interactions on the packing density. The model is trained on a dataset of structures extracted from the Cambridge Structural Database. Excellent performance is demonstrated for three validation sets of unseen data.
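
As a rough illustration of the geometric descriptor, the sketch below estimates the volume enclosed by a probe-inflated molecular surface by counting voxels on a regular grid. It is not the "projected marching cubes" algorithm of the abstract, which is not specified here. The function name `enclosed_volume`, the approximation of the surface as a union of atomic spheres with radii enlarged by the probe radius, and the example radii and probe value are all assumptions made for illustration only.

```python
import numpy as np

def enclosed_volume(coords, radii, probe_radius=0.0, spacing=0.1):
    """Voxel-count estimate of the volume of a union of inflated atomic spheres.

    Assumption: the surface is approximated by spheres of radius
    (atomic radius + probe_radius); this is a simple stand-in, not the
    packing-accessible surface construction used by the authors.
    Units are arbitrary but must be consistent (e.g. angstroms).
    """
    coords = np.asarray(coords, dtype=float)
    r = np.asarray(radii, dtype=float) + probe_radius

    # Bounding box that contains every inflated sphere.
    lo = (coords - r[:, None]).min(axis=0)
    hi = (coords + r[:, None]).max(axis=0)

    # Voxel centers along each axis.
    axes = [np.arange(lo[k] + spacing / 2, hi[k], spacing) for k in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")

    # Mark every voxel whose center lies inside at least one sphere.
    inside = np.zeros(gx.shape, dtype=bool)
    for (x, y, z), ri in zip(coords, r):
        inside |= (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2 <= ri ** 2

    return inside.sum() * spacing ** 3

# Hypothetical two-atom example with illustrative radii and probe radius.
v_bare = enclosed_volume([[0, 0, 0], [1.5, 0, 0]], [1.7, 1.7])
v_probe = enclosed_volume([[0, 0, 0], [1.5, 0, 0]], [1.7, 1.7], probe_radius=0.5)
```

A larger probe radius gives a larger enclosed volume, so in a model of this kind the probe radius acts as a tunable parameter. The accuracy of the voxel estimate is controlled by `spacing`, at a memory cost that grows with the cube of the grid resolution.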

Presenters

  • Imanuel Bier

    Carnegie Mellon Univ

Authors

  • Imanuel Bier

    Carnegie Mellon Univ

  • Noa Marom

    Carnegie Mellon Univ