Designing a Protein-Protein Interface Scoring Function Using Graph Neural Networks

ORAL

Abstract

Machine learning techniques have been applied to numerous biological questions, including the scoring of predicted protein-protein interfaces (PPIs). Graph neural networks (GNNs) have been proposed as an optimal tool for PPI scoring due to their ability to map three-dimensional protein conformations onto graphs of nodes and edges with rotational invariance and no loss of information. However, in a comparison study of state-of-the-art PPI scoring functions, we found that GNN-based models were consistently outperformed by non-GNN-based scoring functions. Notably, we found that for a dataset of 84 unique heterodimers, counting the number of contacts between heavy atoms at the interface yields higher Spearman correlations with the ground truth score DockQ than two recent GNN-based models, DeepRank-GNN-ESM and GNN_DOVE.
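As a rough illustration of the contact-counting baseline described above, the following sketch counts heavy-atom pairs across an interface and computes a Spearman correlation against DockQ. The 4.5 Å cutoff, the function names, and the plain coordinate-list input are assumptions made for this example; the abstract does not state the contact definition that was actually used.

```python
import math

def count_interface_contacts(coords_a, coords_b, cutoff=4.5):
    """Count heavy-atom pairs (one atom per chain) closer than `cutoff` angstroms.

    coords_a, coords_b: lists of (x, y, z) heavy-atom coordinates, one list per chain.
    The 4.5 A default is an assumed value, not one taken from the abstract.
    """
    return sum(
        1
        for a in coords_a
        for b in coords_b
        if math.dist(a, b) < cutoff
    )

def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = 0.5 * (i + j) + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

For each target, one would score every docking model with `count_interface_contacts` and pass the scores and the corresponding DockQ values to `spearman`. The all-pairs loop is quadratic in the number of atoms, so a spatial index would be preferable for large complexes.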

We propose that the poor performance of GNN-based scoring functions is caused by a combination of factors, including a narrow distribution of protein model ground truth scores within GNN training sets and uninformative node and edge features. We present a novel GNN-based PPI scoring function, which is trained on a dataset of models that has been uniformly sampled over DockQ, with an even split between near- and non-native models. This scoring function uses local geometric and physical node and edge features, including electrostatic and hydrophobic interactions, and local packing fraction. We compare the performance of our GNN-based scoring function to other current scoring functions.
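One simple way to obtain a training set that is approximately uniform over DockQ is to bin the models by score and draw the same number from each bin. The sketch below illustrates that idea only; the bin count, the per-bin quota, and the function name are hypothetical, and the abstract does not describe the sampling procedure the authors used.

```python
import random

def sample_uniform_over_dockq(models, n_bins=10, per_bin=2, seed=0):
    """Draw up to `per_bin` models from each equal-width DockQ bin.

    models: list of (model_id, dockq) pairs with dockq in [0, 1].
    n_bins and per_bin are illustrative defaults, not values from the abstract.
    """
    rng = random.Random(seed)
    bins = [[] for _ in range(n_bins)]
    for model_id, dockq in models:
        # Clamp so that dockq == 1.0 falls in the last bin.
        idx = min(int(dockq * n_bins), n_bins - 1)
        bins[idx].append((model_id, dockq))

    selected = []
    for members in bins:
        # A sparsely populated bin contributes every model it has.
        k = min(per_bin, len(members))
        selected.extend(rng.sample(members, k))
    return selected
```

Bins with fewer than `per_bin` models leave the result short of perfectly uniform, which is why a pool of docking models dominated by low-DockQ decoys generally needs additional near-native sampling before this step.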

Presenters

  • Naomi Brandt

    Yale University

Authors

  • Naomi Brandt

    Yale University

  • Jacob Sumner

    Yale University

  • Devon Finlay

    Yale University

  • Corey S O'Hern

    Yale University