Non-Gaussian effects in finite Bayesian neural networks
ORAL
Abstract
Bayesian neural networks (BNNs) are theoretically well-understood only in the infinite-width limit, where Gaussian priors over network weights yield Gaussian priors over network outputs. Recent works have suggested that finite BNNs may outperform their infinite cousins because finite networks can adapt their internal representations of data, but our understanding of the non-Gaussian priors and learned hidden-layer representations of finite networks remains incomplete. Here, we take two steps towards an understanding of non-Gaussian effects in finite BNNs. First, we argue that the leading finite-width corrections to the feature kernels for any BNN with linear readout have a largely universal form. We illustrate this for three tractable network architectures: deep linear fully-connected and convolutional networks, and networks with a single nonlinear hidden layer. Second, we derive exact solutions for the marginal function-space priors of a class of finite feedforward BNNs. These results unify previous descriptions of finite BNN priors in terms of their tail decay and asymptotics. In total, our work begins to elucidate how finite BNNs differ from their infinite-width counterparts.
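To make the width dependence concrete, the following is a minimal illustrative sketch (not taken from the papers): it Monte Carlo samples the function-space prior of a one-hidden-layer ReLU network with linear readout at a single fixed input, and tracks the excess kurtosis of the output, which vanishes in the infinite-width Gaussian limit but is clearly nonzero at small widths. The architecture, scalings, and function names here are illustrative assumptions, not the specific models analyzed in the papers.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_prior_outputs(width, n_samples=200_000):
    """Prior samples of f(x) at one fixed input for a one-hidden-layer ReLU
    network with linear readout and isotropic Gaussian weight priors.

    For a single input, the hidden preactivations are i.i.d. standard
    Gaussians under the weight prior, so we sample them directly.
    """
    z = rng.normal(size=(n_samples, width))                    # hidden preactivations
    h = np.maximum(z, 0.0)                                     # ReLU features
    v = rng.normal(size=(n_samples, width)) / np.sqrt(width)   # readout weights, 1/sqrt(N) scaling
    return np.sum(v * h, axis=1)                               # network outputs f(x)

def excess_kurtosis(f):
    """Zero for a Gaussian; nonzero values signal a non-Gaussian prior."""
    f = f - f.mean()
    return np.mean(f**4) / np.mean(f**2) ** 2 - 3.0

# Non-Gaussianity of the output prior shrinks as the hidden layer widens;
# the leading correction is O(1/width), consistent with the abstract's claim
# that finite-width effects vanish in the infinite-width limit.
for width in (2, 10, 100):
    f = sample_prior_outputs(width)
    print(f"width {width:4d}: excess kurtosis of f(x) ≈ {excess_kurtosis(f):+.3f}")
```

The 1/sqrt(width) readout scaling is the standard choice that keeps the output variance order one as the width grows, so the only width dependence left in the printed numbers is the finite-size deviation from Gaussianity.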
Publication: preprints at https://arxiv.org/abs/2104.11734 and https://arxiv.org/abs/2106.00651
Presenters
-
Jacob Zavatone-Veth
Harvard University
Authors
-
Jacob Zavatone-Veth
Harvard University
-
Abdulkadir Canatar
Harvard University
-
Benjamin S Ruben
Harvard University
-
Cengiz Pehlevan
Harvard University