Recurrent networks for protein structure prediction using Frenet-Serret equations and latent residue representations

Nazim Bouatta

ORAL

Abstract

A new version of the Recurrent Geometrical Network (RGN¹) algorithm, which reasons geometrically over protein conformations, is used to predict protein structures. It relies on a transfer matrix formalism that builds protein backbones with a discrete version of the Frenet-Serret equations (dFSE), exploiting the fact that protein backbones are intrinsically discrete one-dimensional curves. The dFSE-based RGNs are combined with AminoBERT, a context-based encoding of amino acid residues derived strictly from raw amino acid sequences, without explicit use of evolutionary information. To build AminoBERT, a reformulated version of the BERT language model is used to train a transformer over protein sequences to predict missing amino acids conditioned on the flanking sequence. Each amino acid residue is thereby mapped to a higher-dimensional latent representation that reflects its sequence context.

1. AlQuraishi, M. End-to-End Differentiable Learning of Protein Structure. Cell Syst. (2019) doi:10.1016/j.cels.2019.03.006.
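
The transfer matrix idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the matrix used in the talk, so the code below assumes one commonly used form of the discrete Frenet equations, in which a bond angle kappa and a torsion angle tau at each vertex rotate the frame (n, b, t) into the next one. The function names, the identity starting frame, and the fixed 3.8 Å spacing between consecutive C-alpha atoms are illustrative assumptions, not details taken from the talk.

```python
import math


def transfer_matrix(kappa, tau):
    """Rotation taking the frame (n, b, t) at one vertex to the next vertex.

    Assumed form of the discrete Frenet equations; kappa is the bond angle
    and tau the torsion angle, both in radians.
    """
    ck, sk = math.cos(kappa), math.sin(kappa)
    ct, st = math.cos(tau), math.sin(tau)
    return [
        [ck * ct, ck * st, -sk],
        [-st, ct, 0.0],
        [sk * ct, sk * st, ck],
    ]


def apply_transfer(matrix, frame):
    """Return matrix @ frame, where the rows of frame are the vectors n, b, t."""
    return [
        [sum(matrix[i][k] * frame[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def build_backbone(angles, bond_length=3.8):
    """Place one point per step along the current tangent, then rotate the frame.

    angles is a list of (kappa, tau) pairs; the result has len(angles) + 1 points.
    bond_length is a hypothetical constant spacing between consecutive points.
    """
    frame = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # rows: n, b, t
    position = [0.0, 0.0, 0.0]
    coords = [tuple(position)]
    for kappa, tau in angles:
        tangent = frame[2]
        position = [p + bond_length * c for p, c in zip(position, tangent)]
        coords.append(tuple(position))
        frame = apply_transfer(transfer_matrix(kappa, tau), frame)
    return coords


if __name__ == "__main__":
    # Hypothetical angles, chosen only to show the call pattern.
    for point in build_backbone([(1.5, 0.9)] * 5):
        print(tuple(round(c, 3) for c in point))
```

Because each transfer matrix is a rotation, the tangent stays a unit vector and consecutive points remain exactly `bond_length` apart. In a recurrent model of the kind the abstract describes, the angles would be predicted per residue rather than fixed by hand as in this sketch.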

March 15, 2021, 6:36 PM – March 15, 2021, 6:48 PM

Presenters

Nazim Bouatta

Harvard Medical School

Authors

Nazim Bouatta

Harvard Medical School