Entropy Advantage in Neural Network Generalizability
ORAL
Abstract
While neural networks have been widely used to assist physics research, leveraging physical principles to study neural networks has received far less attention. Inspired by statistical physics, we introduce the concept of entropy into neural networks by reconceptualizing them as hypothetical one-dimensional physical systems in which each parameter is the coordinate of a "particle". We investigate the correlation between the entropy of these physical systems and the generalizability of the corresponding neural networks on four distinct machine learning tasks. Our results suggest an entropy advantage: high-entropy states consistently outperform the states reached by conventional training optimizers such as stochastic gradient descent. We also perform separation-of-variable studies to evaluate the factors controlling the entropy advantage.
Presenters
-
Entao Yang
Air Liquide USA, Air Liquide
Authors
-
Entao Yang
Air Liquide USA, Air Liquide
-
Xiaotian Zhang
City University of Hong Kong
-
Yue Shang
University of Pennsylvania
-
Ge Zhang
City University of Hong Kong