
The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold

ORAL · Invited

Abstract

Deep neural networks can, in principle, represent arbitrary hypotheses. However, we demonstrate that when paired with a concrete task and training algorithm, the training of deep networks explores only a tiny fraction of the available hypothesis space. To visualize and characterize this space, we developed an information-geometric lens. Applying this lens to our experimental data reveals that networks with many different architectures, trained with different optimization procedures and regularization techniques, traverse the same manifold. Moreover, networks trained on different tasks also lie on a low-dimensional manifold.

We study the details of this manifold and find that networks with different architectures follow distinguishable trajectories, while other factors have minimal influence: larger networks train along a manifold similar to that of smaller networks, only faster; and networks initialized at very different points in prediction space converge to solutions along a similar manifold. We analytically predict this phenomenon for linear networks, showing that it depends critically on the structure of the task.
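To make the "information-geometric lens" concrete, the sketch below shows one way such an embedding of training trajectories can be computed: checkpoints are represented by their predicted probability distributions, pairwise Bhattacharyya distances are taken between them, and the squared-distance matrix is double-centered and eigendecomposed, in the spirit of the InPCA-style embedding described in the publications below. This is an illustrative reconstruction, not the authors' code; the function names, the toy trajectory, and the choice of keeping eigenvalues by absolute magnitude are assumptions made here.

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya distance between discrete probability distributions p and q."""
    return -np.log(np.clip(np.sum(np.sqrt(p * q)), 1e-12, None))

def inpca_embed(P, k=2):
    """Embed rows of P (one probability vector per training checkpoint) in k dims.

    Double-centers the squared-distance matrix as in classical MDS, then keeps
    the k eigendirections of largest absolute eigenvalue (an InPCA-style choice
    that retains negative-eigenvalue directions; an assumption of this sketch).
    """
    n = P.shape[0]
    D2 = np.array([[bhattacharyya(P[i], P[j]) ** 2 for j in range(n)]
                   for i in range(n)])
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ D2 @ J
    w, V = np.linalg.eigh(B)
    order = np.argsort(-np.abs(w))[:k]
    return V[:, order] * np.sqrt(np.abs(w[order]))

# Toy "training trajectory": predictions drifting from uniform toward one-hot.
t = np.linspace(0.0, 1.0, 8)[:, None]
P = (1 - t) * np.full(4, 0.25) + t * np.array([0.97, 0.01, 0.01, 0.01])
X = inpca_embed(P, k=2)  # low-dimensional coordinates of the trajectory
```

With trajectories from many networks stacked into `P`, plotting the leading coordinates is what lets one see whether different architectures trace the same low-dimensional manifold in prediction space.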

This work was conducted in collaboration with Pratik Chaudhari (University of Pennsylvania), Jialin Mao (University of Pennsylvania), Rahul Ramesh (University of Pennsylvania), Rubing Yang (University of Pennsylvania), Mark Transtrum (Brigham Young University), Han Kheng Teoh (Cornell University), and James P. Sethna (Cornell University).

Publications:

  1. Mao, J., Griniasty, I., Teoh, H.K., Ramesh, R., Yang, R., Transtrum, M.K., Sethna, J.P. and Chaudhari, P., 2024. The training process of many deep networks explores the same low-dimensional manifold. Proceedings of the National Academy of Sciences, 121(12), p.e2310002121.
  2. Ramesh, R., Mao, J., Griniasty, I., Yang, R., Teoh, H.K., Transtrum, M., Sethna, J.P. and Chaudhari, P., 2023. A picture of the space of typical learnable tasks. Proceedings of the International Conference on Machine Learning (ICML).

Presenters

  • Itay Griniasty

    Cornell University
