Analyses of the cores of AlphaFold2 protein structure predictions

Jillian Belluck; Alex T Grigas; Corey S O'Hern

POSTER

Abstract

Developing computational methods to accurately predict the three-dimensional structure of a protein from its primary sequence of amino acids is an important and unsolved problem. AlphaFold2, a deep learning methodology developed by DeepMind to generate computational models of proteins, has been successful in recent Critical Assessment of protein Structure Prediction competitions. In the present work, we assess AlphaFold2 computational models using the number of residues in the core, a feature that is strongly correlated with protein stability. We find that while AlphaFold2's predictions for the E. coli proteome resemble X-ray crystal structures, the eukaryotic protein predictions contain too few core residues. Our analysis considers the influence of intrinsically disordered sequences on the fraction of core residues, using both AlphaFold2's per-residue confidence levels and the average charge and hydrophobicity of each protein. The variability in the core size of AlphaFold2's predictions across organisms demonstrates that while machine learning methods have increased the accuracy of computational models for protein structure, significant improvements must be made to achieve results comparable to those in experiments.
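As a rough illustration of the quantities named in the abstract, the sketch below computes a core fraction, a mean net charge, and a mean hydrophobicity for a single protein. It is not the authors' pipeline. The abstract does not define a core residue, so the relative solvent-accessible surface area (rSASA) cutoff, the pLDDT cutoff of 70, the choice of the Kyte-Doolittle hydropathy scale, and the simple charge assignment (K/R = +1, D/E = -1) are all assumptions, as are the function names. The per-residue rSASA and pLDDT values are assumed to have been computed beforehand with a tool of the reader's choice.

```python
from typing import Optional, Sequence

# Kyte-Doolittle hydropathy scale (assumed here; the abstract names no scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def core_fraction(
    rsasa: Sequence[float],
    plddt: Optional[Sequence[float]] = None,
    rsasa_cutoff: float = 1e-3,   # assumed definition of "core"
    plddt_cutoff: float = 70.0,   # assumed confidence threshold
) -> float:
    """Fraction of residues whose rSASA falls below rsasa_cutoff.

    If plddt is given, only residues with pLDDT >= plddt_cutoff are
    counted, which is one way to exclude likely disordered regions.
    """
    if plddt is not None:
        if len(plddt) != len(rsasa):
            raise ValueError("rsasa and plddt must have the same length")
        kept = [r for r, p in zip(rsasa, plddt) if p >= plddt_cutoff]
    else:
        kept = list(rsasa)
    if not kept:
        return 0.0
    return sum(r < rsasa_cutoff for r in kept) / len(kept)


def mean_net_charge(seq: str) -> float:
    """Average charge per residue, counting K/R as +1 and D/E as -1."""
    return sum((aa in "KR") - (aa in "DE") for aa in seq) / len(seq)


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy per residue."""
    return sum(KD[aa] for aa in seq) / len(seq)
```

Comparing `core_fraction(rsasa)` with `core_fraction(rsasa, plddt)` for the same prediction gives a quick sense of how much low-confidence regions lower the apparent core size.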

Presenters

Jillian Belluck

Brown University, Yale University

Authors

Jillian Belluck

Brown University, Yale University

Alex T Grigas

Yale University

Corey S O'Hern

Yale University