Recurrent networks for protein structure prediction using Frenet-Serret equations and latent residue representations
ORAL
Abstract
We predict protein structures with a new version of the Recurrent Geometrical Network (RGN) algorithm [1], which reasons geometrically over protein conformations. The new version uses a transfer matrix formalism that reasons over protein backbones through a discrete version of the Frenet-Serret equations (dFSE), exploiting the fact that protein backbones are intrinsically discrete one-dimensional curves. The dFSE-based RGNs are combined with AminoBERT, a context-based encoding of amino acid residues derived strictly from raw amino acid sequences, without explicit use of any evolutionary information. To build AminoBERT, we use a reformulated version of the BERT language model to train a transformer over protein sequences to predict missing amino acids conditioned on the flanking sequence. Each amino acid residue is thus mapped onto a higher-dimensional representation.
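To illustrate the transfer matrix idea, the sketch below builds a backbone trace from a sequence of bond and torsion angles by repeatedly rotating a discrete Frenet frame (normal n, binormal b, tangent t). It is a minimal illustration, not the authors' implementation: the matrix convention is one commonly used for discrete Frenet frames, and the function names, the fixed bond length of 3.8 (roughly the distance in angstroms between consecutive alpha carbons), and the initial frame are all assumptions made for the example.

```python
import math


def transfer_matrix(kappa, tau):
    """Rotation taking the frame (n, b, t) at one vertex to the next.

    kappa is the bond angle and tau the torsion angle, in radians.
    The sign and ordering convention here is an assumption.
    """
    ck, sk = math.cos(kappa), math.sin(kappa)
    ct, st = math.cos(tau), math.sin(tau)
    return [
        [ck * ct, ck * st, -sk],
        [-st, ct, 0.0],
        [sk * ct, sk * st, ck],
    ]


def advance_frame(frame, kappa, tau):
    """Apply the transfer matrix to frame = [n, b, t] (rows are 3-vectors)."""
    m = transfer_matrix(kappa, tau)
    return [
        [sum(m[r][k] * frame[k][c] for k in range(3)) for c in range(3)]
        for r in range(3)
    ]


def build_backbone(kappas, taus, bond_length=3.8):
    """Return backbone coordinates for the given angle sequences.

    Starts at the origin with the tangent along z (an arbitrary choice),
    so n angle pairs yield n + 2 points.
    """
    frame = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    pos = [0.0, 0.0, 0.0]
    coords = [pos]
    # First bond follows the initial tangent.
    pos = [p + bond_length * t for p, t in zip(pos, frame[2])]
    coords.append(pos)
    for kappa, tau in zip(kappas, taus):
        frame = advance_frame(frame, kappa, tau)
        # Each new point is one bond length along the updated tangent.
        pos = [p + bond_length * t for p, t in zip(pos, frame[2])]
        coords.append(pos)
    return coords
```

Because every step is a rotation followed by a fixed-length translation, consecutive points are always exactly one bond length apart, and the angle between successive tangents equals the bond angle supplied for that step. In a recurrent network setting, the angles would be predicted per residue rather than passed in by hand.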
1. AlQuraishi, M. End-to-End Differentiable Learning of Protein Structure. Cell Syst. (2019) doi:10.1016/j.cels.2019.03.006.
Presenters
Nazim Bouatta
Harvard Medical School
Authors
Nazim Bouatta
Harvard Medical School