
The Role of Data in the Sloppiness of Deep Networks

ORAL

Abstract

We study how the dataset may be the cause of the anomalous generalization performance of deep networks. We show that the data correlation matrix of typical classification datasets has an eigenspectrum where, after a sharp initial drop, a large number of small eigenvalues are distributed uniformly over an exponentially large range. This structure is mirrored in a network trained on this data: we show that the Hessian and the Fisher Information Matrix (FIM) have eigenvalues that are spread uniformly over exponentially large ranges. For such "sloppy" eigenspectra, sets of weights corresponding to small eigenvalues can be modified by large magnitudes without affecting the loss. Networks trained on atypical, non-sloppy synthetic data do not share these traits. We show how this structure in the data sheds light on the generalization performance of deep networks using PAC-Bayesian analysis.
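The eigenspectrum structure described above can be illustrated with a minimal sketch. The code below is not from the paper; it is a hypothetical example that builds synthetic anisotropic data whose feature scales decay as a power law (a simple stand-in for the kind of dataset whose correlation matrix is "sloppy"), then computes the eigenvalues of the data correlation matrix and measures how many orders of magnitude they span.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a classification dataset: n samples, d features,
# with power-law feature scales. This is an assumption for illustration,
# not the datasets used in the paper.
n, d = 2000, 100
scales = np.arange(1, d + 1) ** -1.0        # feature scale decays like 1/k
X = rng.standard_normal((n, d)) * scales    # anisotropic Gaussian data

# Data correlation (second-moment) matrix and its eigenvalues, descending.
C = X.T @ X / n
eigs = np.sort(np.linalg.eigvalsh(C))[::-1]

# A "sloppy" spectrum spreads over exponentially many scales: measure the
# range in decades (orders of magnitude) between largest and smallest.
decades = np.log10(eigs[0] / eigs[-1])
print(f"spectrum spans ~{decades:.1f} decades")
```

With the 1/k scales above, the eigenvalues decay roughly like 1/k², so the spectrum spans about four decades; isotropic data (all scales equal) would instead concentrate its eigenvalues near a single value.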

Presenters

  • Pratik Chaudhari

    University of Pennsylvania

Authors

  • Pratik Chaudhari

    University of Pennsylvania

  • Rubing Yang

    University of Pennsylvania

  • Jialin Mao

    University of Pennsylvania