PolyUniverse: generation of a large-scale polymer library using rule-based polymerization reactions for polymer informatics

ORAL

Abstract

Recent breakthroughs in machine learning have transformed polymer research, accelerating the integration of diverse computational techniques for de novo molecular design. A key focus of these efforts is to significantly expand the number of candidate polymer structures, as the pool of known real polymers remains limited. In contrast, small molecule databases, such as those used in drug discovery, are vast and provide numerous opportunities for new molecular designs. In this study, we curated a large set of small molecule compounds from GDB-17, GDB-13, and PubChem and selected polymerization reaction pathways for eight types of polymers: polyimide, polyolefin, polyester, polyamide, polyurethane, epoxy, polybenzimidazole (PBI), and vitrimer. Leveraging these small molecule datasets and polymerization reactions, we generated hundreds of quadrillions of hypothetical polymer structures. For each of the eight polymers, plus one promising copolymer, poly(imide-imine), we randomly generated over one million hypothetical structures, with the exception of PBI, for which we produced 10,000 structures. To evaluate the feasibility of synthesizing these new polymers, we visualized their chemical space using t-distributed stochastic neighbor embedding (t-SNE) and computed synthetic accessibility scores. Additionally, customized feedforward neural network models predicted the thermal, mechanical, and gas permeation properties of both real and hypothetical polymers. The findings reveal that many hypothetical polymers, particularly polyimides, show exceptional potential, often outperforming real polymers in high-temperature applications and gas separation. This study underscores the immense value of large-scale hypothetical polymer libraries in materials discovery and design. Such libraries not only help identify promising new polymer materials through high-throughput screening but also offer valuable datasets for training advanced machine learning models, including large language models.
Overall, this research highlights the power of data-driven approaches in polymer science, paving the way for the development of next-generation polymeric materials with superior properties for various industrial applications.
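
As a rough illustration of why pairing monomer pools yields such a large candidate space, and how a random subset can be drawn without enumerating it, the following is a minimal sketch only. It assumes a two-monomer step-growth case (for example, a dianhydride and a diamine forming a polyimide); the pool contents, pool sizes, and the function names `count_pairs` and `sample_pairs` are hypothetical placeholders and are not taken from the study, which applies rule-based polymerization reactions to real molecular structures.

```python
import random


def count_pairs(n_a, n_b):
    """Number of distinct A-B monomer pairings from two pools of given sizes."""
    return n_a * n_b


def sample_pairs(pool_a, pool_b, k, seed=0):
    """Draw k distinct (a, b) monomer pairs without building the full product."""
    total = len(pool_a) * len(pool_b)
    if k > total:
        raise ValueError("k exceeds the number of possible pairs")
    rng = random.Random(seed)
    # Sample flat indices lazily from a range, then decode each index
    # into a (position in pool_a, position in pool_b) pair.
    indices = rng.sample(range(total), k)
    return [(pool_a[i // len(pool_b)], pool_b[i % len(pool_b)]) for i in indices]


# Hypothetical placeholder pools; a real workflow would hold molecular
# structures (e.g. SMILES strings) filtered by functional group.
dianhydrides = [f"dianhydride_{i}" for i in range(1000)]
diamines = [f"diamine_{j}" for j in range(2000)]

print(count_pairs(len(dianhydrides), len(diamines)))
print(sample_pairs(dianhydrides, diamines, k=5))
```

Because the candidate count grows as the product of the pool sizes, sampling flat indices keeps memory use proportional to the number of structures drawn rather than to the size of the full space. Turning each sampled pair into a polymer repeat unit would require an actual reaction rule, which this sketch deliberately leaves out.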

Publication: Yue, Tianle, Jianxin He, and Ying Li. "PolyUniverse: Generation of a Large-scale Polymer Library Using Rule-Based Polymerization Reactions for Polymer Informatics."

Presenters

  • Tianle Yue

    University of Wisconsin-Madison

Authors

  • Tianle Yue

    University of Wisconsin-Madison

  • Ying Li

    University of Wisconsin-Madison

  • Jianxin He

    University of Wisconsin-Madison