Predicting Protein Developability via Convolutional Sequence Representation

ORAL

Abstract

Engineered proteins have emerged as novel diagnostics, therapeutics, and catalysts. Often, poor protein developability, quantified by expression, solubility, and stability, hinders commercialization. The ability to predict protein developability from amino acid sequence would reduce the experimental burden when selecting candidates. Recent advances in screening technologies enabled a high-throughput (HT) developability dataset for 10^5 of 10^20 possible variants of protein scaffold Gp2. In this work, we evaluate the ability of neural networks to learn a developability representation from the HT dataset and transfer the knowledge to predict recombinant expression beyond the observed sequences. Mimicking protein theory, our model convolves learned amino acid properties to predict expression levels 42% closer to the experimental variance compared to a non-embedded control. Analysis of learned amino acid embeddings highlights the uniqueness of cysteine, the importance of hydrophobicity and charge, and the unimportance of aromaticity when aiming to improve developability. We identify clusters of similar sequences with increased developability through nonlinear dimensionality reduction (UMAP) and explore the inferred developability landscape via nested sampling.

Presenters

  • Alexander Golinski

    Department of Chemical Engineering and Materials Science, University of Minnesota

Authors

  • Alexander Golinski

    Department of Chemical Engineering and Materials Science, University of Minnesota

  • Bryce Johnson

    School of Physics and Astronomy, University of Minnesota

  • Sidharth Laxminarayan

    University of Minnesota

  • Diya Saha

    Department of Chemical Engineering and Materials Science, University of Minnesota

  • Sandhya Appiah

    Department of Chemical Engineering and Materials Science, University of Minnesota

  • Benjamin Hackel

    University of Minnesota

  • Stefano Martiniani

    Department of Chemical Engineering and Materials Science, University of Minnesota