Criticality in Deep Neural Networks using Jacobian(s)

ORAL

Abstract

Deep Neural Networks (DNNs) have proven to be extremely successful in a variety of pattern-recognition tasks. In contrast, the analytical understanding of these networks lags far behind. The simplifying limit of infinite width offers a way to analyze such networks: in this limit, DNNs become Gaussian Processes and take on a deterministic form.
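
A minimal sketch (illustrative only, not part of the abstract) of the infinite-width limit described above: the output of a randomly initialized, wide, fully connected network at a fixed input becomes approximately Gaussian-distributed as the width grows. All function names and hyperparameters below are assumptions made for illustration.

```python
# Illustrative sketch (assumed setup, not the authors' code): sample the output
# of a random tanh MLP at a fixed input and observe an approximately Gaussian
# distribution when the width is large.
import numpy as np

def random_mlp_output(x, width, depth, sigma_w=1.0, sigma_b=0.0, rng=None):
    """Scalar output of a random tanh MLP with 1/sqrt(fan_in) weight scaling."""
    rng = rng if rng is not None else np.random.default_rng()
    h = x
    for _ in range(depth):
        W = rng.normal(0.0, sigma_w / np.sqrt(h.size), size=(width, h.size))
        b = rng.normal(0.0, sigma_b, size=width)
        h = np.tanh(W @ h + b)
    w_out = rng.normal(0.0, 1.0 / np.sqrt(width), size=width)
    return w_out @ h

x = np.ones(16)
samples = np.array([random_mlp_output(x, width=2048, depth=4) for _ in range(2000)])
# For large width, the histogram of `samples` is close to Gaussian.
print(f"mean ~ {samples.mean():.3f}, std ~ {samples.std():.3f}")
```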

Working in this limit, we study the “propagation of signal” through the network to identify “phases” in the space of parameter distributions (weights and biases). Specifically, we focus on the propagation of gradients, using the Jacobian(s) of the network function. The norms of the Jacobian matrices succinctly capture the converging and diverging behavior (phases) of gradient propagation. Furthermore, we show that the network performs optimally at the boundary between these two phases. The analysis provides optimal parameter initializations for training DNNs.
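
A minimal sketch (with assumed widths, depths, and variances, not the authors' code) of the quantity described above: the norm of the input-output Jacobian of a random tanh network. A small weight variance makes the norm shrink with depth (ordered phase), a large one makes it grow (chaotic phase); the critical boundary between the two is where training is expected to work best.

```python
# Illustrative sketch: track the Frobenius norm of the input-output Jacobian
# of a randomly initialized tanh MLP for sub-critical, near-critical, and
# super-critical weight variances.
import numpy as np

def jacobian_norm(width, depth, sigma_w, sigma_b=0.0, seed=0):
    """Frobenius norm of the layer-0 -> layer-L Jacobian of a random tanh MLP."""
    rng = np.random.default_rng(seed)
    h = rng.normal(size=width)          # random input
    J = np.eye(width)                   # Jacobian d h_L / d h_0, built layer by layer
    for _ in range(depth):
        W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
        b = rng.normal(0.0, sigma_b, size=width)
        z = W @ h + b
        h = np.tanh(z)
        J = np.diag(1.0 - h ** 2) @ W @ J   # chain rule: diag(tanh'(z)) W
    return np.linalg.norm(J)

for sigma_w in (0.5, 1.0, 2.0):             # shrinking / roughly stable / growing norm
    print(f"sigma_w = {sigma_w}: |J| = {jacobian_norm(512, 30, sigma_w):.3e}")
```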

Presenters

  • Darshil H Doshi

    Brown University

Authors

  • Darshil H Doshi

    Brown University

  • Andrey Gromov

    Brown University

  • Tianyu He

    Brown University