Criticality in Deep Neural Networks using Jacobian(s)

ORAL

Abstract

Deep Neural Networks (DNNs) have proven to be extremely successful in a variety of pattern-recognition tasks. In contrast, the analytical understanding of these networks lags far behind. The simplifying limit of infinite width offers a way to analyze such networks: in this limit, DNNs become Gaussian Processes and take on a deterministic form.
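
A minimal sketch (illustrative only, not part of the abstract) of the infinite-width limit described above: the output of a randomly initialized, wide, fully connected network at a fixed input becomes approximately Gaussian-distributed as the width grows. All function names and hyperparameters below are assumptions made for illustration.

```python
# Illustrative sketch (assumed setup, not the authors' code): sample the output
# of a random tanh MLP at a fixed input and observe an approximately Gaussian
# distribution when the width is large.
import numpy as np

def random_mlp_output(x, width, depth, sigma_w=1.0, sigma_b=0.0, rng=None):
    """Scalar output of a random tanh MLP with 1/sqrt(fan_in) weight scaling."""
    rng = rng if rng is not None else np.random.default_rng()
    h = x
    for _ in range(depth):
        W = rng.normal(0.0, sigma_w / np.sqrt(h.size), size=(width, h.size))
        b = rng.normal(0.0, sigma_b, size=width)
        h = np.tanh(W @ h + b)
    w_out = rng.normal(0.0, 1.0 / np.sqrt(width), size=width)
    return w_out @ h

x = np.ones(16)
samples = np.array([random_mlp_output(x, width=2048, depth=4) for _ in range(2000)])
# For large width, the histogram of `samples` is close to Gaussian.
print(f"mean ~ {samples.mean():.3f}, std ~ {samples.std():.3f}")
```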

Working in this limit, we study the “propagation of signal” through the network to identify “phases” in the space of parameter distributions (weights and biases). Specifically, we focus on the propagation of gradients, using the Jacobian(s) of the network function. The norms of the Jacobian matrices succinctly capture the converging and diverging behavior (phases) of gradient propagation. Furthermore, we show that the network performs optimally at the boundary between these two phases. The analysis provides optimal parameter initializations for training DNNs.
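
A minimal sketch (with assumed widths, depths, and variances, not the authors' code) of the quantity described above: the norm of the input-output Jacobian of a random tanh network. A small weight variance makes the norm shrink with depth (ordered phase), a large one makes it grow (chaotic phase); the critical boundary between the two is where training is expected to work best.

```python
# Illustrative sketch: track the Frobenius norm of the input-output Jacobian
# of a randomly initialized tanh MLP for sub-critical, near-critical, and
# super-critical weight variances.
import numpy as np

def jacobian_norm(width, depth, sigma_w, sigma_b=0.0, seed=0):
    """Frobenius norm of the layer-0 -> layer-L Jacobian of a random tanh MLP."""
    rng = np.random.default_rng(seed)
    h = rng.normal(size=width)          # random input
    J = np.eye(width)                   # Jacobian d h_L / d h_0, built layer by layer
    for _ in range(depth):
        W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
        b = rng.normal(0.0, sigma_b, size=width)
        z = W @ h + b
        h = np.tanh(z)
        J = np.diag(1.0 - h ** 2) @ W @ J   # chain rule: diag(tanh'(z)) W
    return np.linalg.norm(J)

for sigma_w in (0.5, 1.0, 2.0):             # shrinking / roughly stable / growing norm
    print(f"sigma_w = {sigma_w}: |J| = {jacobian_norm(512, 30, sigma_w):.3e}")
```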

Presenters

  • Darshil H Doshi

    Brown University

Authors

  • Darshil H Doshi

    Brown University

  • Andrey Gromov

    Brown University

  • Tianyu He

    Brown University