GSNP Dissertation Award: Statistical mechanics of Bayesian inference and learning in neural networks
ORAL · Invited
Abstract
This thesis collects a few of my essays toward understanding representation learning and generalization in neural networks. I focus on the model setting of Bayesian learning and inference, where the problem of deep learning is naturally viewed through the lens of statistical mechanics. First, I consider properties of freshly initialized deep networks, with all parameters drawn according to Gaussian priors. I provide exact solutions for the marginal prior predictive of networks with isotropic priors and linear or rectified-linear activation functions. I then study the effect of introducing structure to the priors of linear networks from the perspective of random matrix theory. Turning to memorization, I consider how the choice of nonlinear activation function affects the storage capacity of treelike neural networks. Then, I come at last to representation learning. I study the structure of learned representations in Bayesian neural networks at large but finite width, which are amenable to perturbative treatment. I then show how the ability of these networks to generalize when presented with unseen data is affected by representational flexibility, through precise comparison to models with frozen, random representations. In the final portion of this thesis, I bring a geometric perspective to bear on the structure of neural network representations. I first consider how the demand of fast inference shapes optimal representations in recurrent networks. Then, I consider the geometry of representations in deep object classification networks from a Riemannian perspective. In total, this thesis begins to elucidate the structure and function of optimally distributed neural codes in artificial neural networks.
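To make the setting concrete, below is a minimal numerical sketch of the marginal prior predictive of a deep rectified-linear network with isotropic Gaussian priors, estimated by Monte Carlo sampling of the weights. This is an illustrative assumption-laden example (standard 1/sqrt(fan-in) prior scaling, scalar readout), not the exact analytical solutions described in the thesis.

```python
# Monte Carlo sketch of the marginal prior predictive f(x) for a deep ReLU
# network with isotropic Gaussian priors on all weights (1/sqrt(fan-in) scaling).
import numpy as np

rng = np.random.default_rng(0)

def sample_prior_predictive(x, widths, n_samples=10_000, sigma_w=1.0):
    """Draw scalar network outputs f(x) with weights sampled from the Gaussian prior."""
    outputs = np.empty(n_samples)
    for s in range(n_samples):
        h, fan_in = x, x.shape[0]
        for width in widths:
            W = rng.normal(0.0, sigma_w / np.sqrt(fan_in), size=(width, fan_in))
            h = np.maximum(W @ h, 0.0)  # ReLU hidden layer
            fan_in = width
        w_out = rng.normal(0.0, sigma_w / np.sqrt(fan_in), size=fan_in)
        outputs[s] = w_out @ h  # scalar linear readout
    return outputs

x = np.ones(10) / np.sqrt(10.0)  # unit-norm test input
samples = sample_prior_predictive(x, widths=[50, 50])
print(f"prior predictive mean ~ {samples.mean():.3f}, variance ~ {samples.var():.3f}")
```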
Publication: See jzv.io for a complete list of papers
Presenters
- Jacob Zavatone-Veth, Harvard University
Authors
- Jacob Zavatone-Veth, Harvard University