A massive dataset of synthesis-friendly hypothetical polymers
ORAL
Abstract
Polymer informatics is an emerging field in materials science. It aims to build data-driven models that predict polymer properties almost instantaneously, and to use this capability to screen a large set of candidate polymers and identify promising ones based on their predicted properties. For such screening to be useful, however, the candidate set must consist of synthesizable polymers. Starting from ~13k experimentally known polymers, we identified two distinct pathways to generate a dataset of synthesis-friendly hypothetical polymers. The first is a combinatorial assembly of retrosynthetic fragments obtained from the ~13k polymers; the second is a framework that treats polymers as graphs and applies graph-to-graph translation. Together these have produced a massive dataset of 100 million hypothetical but synthesis-friendly polymers. Additionally, we quantify the synthetic feasibility of each polymer as a score and demonstrate that a large portion of the generated polymers are synthesis-ready. This database can be used (1) for direct screening with available property prediction models, and (2) within unsupervised approaches to train generative models that enable and accelerate polymer discovery.
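As a rough illustration of the first pathway and of use case (1), the toy sketch below enumerates combinations of fragments and filters them with a property predictor. It is only a sketch under stated assumptions: the fragment labels, the `assemble` and `screen` helpers, and the `toy_predict` function are all hypothetical stand-ins, not the authors' fragments, pipeline, or models, and no chemistry (such as reaction compatibility between fragments) is checked.

```python
from itertools import combinations_with_replacement

# Hypothetical fragment labels. The real fragments come from a retrosynthetic
# decomposition of the ~13k known polymers and are not listed in the abstract.
fragments = ["A", "B", "C"]


def assemble(fragments, size=2):
    """Enumerate unordered combinations of `size` fragments as candidate repeat units."""
    return ["-".join(combo) for combo in combinations_with_replacement(fragments, size)]


def screen(candidates, predict, threshold):
    """Keep the candidates whose predicted property value meets the threshold."""
    return [c for c in candidates if predict(c) >= threshold]


def toy_predict(candidate):
    # Placeholder for a trained property model (assumption):
    # simply counts how often fragment "A" appears in the candidate.
    return candidate.count("A")


candidates = assemble(fragments)
hits = screen(candidates, toy_predict, threshold=1)
print(len(candidates), hits)
```

Even in this toy setting the candidate count grows combinatorially with the number of fragments and the assembly size, which is why a fast predictor is needed for the screening step.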
Presenters
- Arunkumar Rajan, Georgia Institute of Technology
Authors
- Arunkumar Rajan, Georgia Institute of Technology
- Chiho Kim, School of Materials Science and Engineering, Georgia Institute of Technology
- Christopher Kuenneth, Georgia Institute of Technology
- Deepak Kamal, Georgia Institute of Technology
- Rishi Gurnani, Georgia Institute of Technology
- Rohit Batra, Georgia Institute of Technology
- Rampi Ramprasad, School of Materials Science and Engineering, Georgia Institute of Technology