Recurrent networks for protein structure prediction using Frenet-Serret equations and latent residue representations

ORAL

Abstract

We use a novel version of the Recurrent Geometrical Network (RGN) algorithm [1], which reasons geometrically over protein conformations, to predict protein structures. A transfer matrix formalism enables reasoning over protein backbones through a discrete version of the Frenet-Serret equations (dFSE), leveraging the fact that protein backbones are intrinsically discrete one-dimensional curves. The dFSE-based RGNs are combined with AminoBERT, a context-based encoding of amino acid residues derived strictly from raw amino acid sequences, without explicit use of any evolutionary information. To build AminoBERT, we use a reformulated version of the BERT language model to train a transformer over protein sequences to predict missing amino acids conditioned on the flanking sequence. Each amino acid residue is thus mapped onto a higher-dimensional latent representation.
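As an illustration of the transfer matrix idea, the following minimal sketch propagates a discrete Frenet frame along a chain and accumulates backbone coordinates. It is not the RGN implementation: the function names `transfer_matrix` and `build_backbone`, the particular sign convention of the rotation, the identity starting frame, and the fixed 3.8 Å spacing between consecutive alpha carbons are all assumptions made for this example.

```python
import math


def transfer_matrix(kappa, tau):
    """Rotation taking the frame (n, b, t) at one vertex to the next vertex.

    kappa is the bond (turning) angle and tau the torsion angle, in radians.
    The sign convention here is one common choice and is an assumption.
    """
    ck, sk = math.cos(kappa), math.sin(kappa)
    ct, st = math.cos(tau), math.sin(tau)
    return [
        [ck * ct, ck * st, -sk],
        [-st, ct, 0.0],
        [sk * ct, sk * st, ck],
    ]


def build_backbone(bond_angles, torsion_angles, bond_length=3.8):
    """Return a list of 3D points for a chain with the given angles.

    bond_length=3.8 (angstroms) is an assumed constant virtual bond length.
    """
    # Rows of `frame` are the normal n, binormal b, and tangent t (assumed start).
    frame = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    position = [0.0, 0.0, 0.0]
    coords = [position]

    # The first bond follows the initial tangent.
    position = [p + bond_length * t for p, t in zip(position, frame[2])]
    coords.append(position)

    for kappa, tau in zip(bond_angles, torsion_angles):
        rot = transfer_matrix(kappa, tau)
        # New frame = rot @ old frame (plain 3x3 matrix product).
        frame = [
            [sum(rot[i][k] * frame[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)
        ]
        # Step one bond length along the updated tangent.
        position = [p + bond_length * t for p, t in zip(position, frame[2])]
        coords.append(position)
    return coords
```

With all angles set to zero the chain is a straight line along the initial tangent; in a learned model, the angle pairs would be predicted per residue rather than supplied by hand.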

1. AlQuraishi, M. End-to-End Differentiable Learning of Protein Structure. Cell Syst. (2019) doi:10.1016/j.cels.2019.03.006.

Presenters

  • Nazim Bouatta

    Harvard Medical School

Authors

  • Nazim Bouatta

    Harvard Medical School