'Interacting' parametric models of learning: insights into the adaptation to data

ORAL · Invited

Abstract

Developing realistic theories of learning in artificial neural networks that capture the adaptation of a model's features to data remains an outstanding challenge and an essential ingredient for understanding modern AI. At a more fundamental level, understanding how nonlinear parametric models adapt to data from the natural world is relevant to theories of learning for physical systems, whether natural, as in soft matter and biological systems, or artificial. In this talk, I will consider quadratic models, a class of parametric models that are quadratic in their parameters and nonlinear in their inputs. They are rich enough to learn data-driven representations yet simple enough to yield new insight; as a special case, they can also serve as an effective theory of neural networks. When learning is cast as a statistical physics problem, these models give rise to 'interacting' theories. I will analyze their dynamics and learning using two methods from statistical physics, (i) dynamical mean-field theory and (ii) the replica method, in a large-N limit of the problem. I will show how new scaling laws capturing representation learning can be derived, and how the scaling behavior depends intimately on the power-law structure present in data from the natural world.
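For orientation, a common concrete form of a quadratic model (an assumption on my part; the abstract does not fix the notation) is the second-order expansion of a model's output in its parameters about a reference point \theta_0, which is also the sense in which such models serve as an effective theory of neural networks:

    f(x;\theta) = f(x;\theta_0)
                + \nabla_\theta f(x;\theta_0)^\top (\theta - \theta_0)
                + \tfrac{1}{2}\,(\theta - \theta_0)^\top \nabla_\theta^2 f(x;\theta_0)\,(\theta - \theta_0).

Truncating this expansion at the linear term yields a model whose features cannot adapt to data (a kernel limit); the quadratic term is the minimal addition that permits representation learning. The scaling laws mentioned are generically power laws, e.g. a test loss L(P) \sim P^{-\alpha} in dataset size P, with the exponent \alpha controlled by the power-law spectral structure of the data; this form is illustrative rather than taken from the talk.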

Presenters

  • Yasaman Bahri

    Google DeepMind

Authors

  • Yasaman Bahri

    Google DeepMind