How do neural networks learn simple functions?

ORAL · Invited

Abstract

To illustrate how concepts and methods at the intersection of high-dimensional probability and statistical physics offer fresh insights into neural networks, I will explore how progress can be made by studying simple models and questions. This talk will cover the mechanisms by which two-layer (and deeper) neural networks learn simple high-dimensional functions from data over time. I will emphasize the intricate interplay between the algorithm, the number of iterations, and the complexity of the task, showing how gradient descent and stochastic gradient descent can capture specific features of a function and generalize beyond what is achievable at random initialization or with kernel methods.

Publication:

  • How Two-Layer Neural Networks Learn, One (Giant) Step at a Time. Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan. To appear in JMLR.

  • The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents. Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala. ICML 2024.

  • Fundamental Computational Limits of Weak Learnability in High-Dimensional Multi-Index Models. Emanuele Troiani, Yatin Dandi, Leonardo Defilippis, Lenka Zdeborová, Bruno Loureiro, Florent Krzakala. arXiv:2405.15480.

  • Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions. Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan. arXiv:2405.15459.

Presenters

  • FLORENT KRZAKALA

    EPFL

Authors

  • FLORENT KRZAKALA

    EPFL