How do neural networks learn simple functions?

ORAL · Invited

Abstract

To illustrate how concepts and methods at the intersection of high-dimensional probability and statistical physics offer fresh insights into neural networks, I will explore how progress can be made by studying simple models and questions. This talk will cover the mechanisms by which two-layer (and deeper) neural networks learn simple high-dimensional functions from data over time. I will emphasize the intricate interplay between the algorithm, the number of iterations, and the complexity of the task, showing how gradient descent and stochastic gradient descent can capture specific features of a function and generalize beyond what is achievable at random initialization or with kernel methods.

Publication:

  • How Two-Layer Neural Networks Learn, One (Giant) Step at a Time. Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan. To appear in JMLR.

  • The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents. Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala. ICML 2024.

  • Fundamental Computational Limits of Weak Learnability in High-Dimensional Multi-Index Models. Emanuele Troiani, Yatin Dandi, Leonardo Defilippis, Lenka Zdeborová, Bruno Loureiro, Florent Krzakala. arXiv:2405.15480.

  • Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions. Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Luca Pesce, Ludovic Stephan. arXiv:2405.15459.

Presenters

  • FLORENT KRZAKALA

    EPFL

Authors

  • FLORENT KRZAKALA

    EPFL