Entropy Advantage in Neural Network Generalizability
ORAL
Abstract
While neural networks have been widely used to assist physics research, leveraging physical principles to study neural networks has received far less attention. Inspired by statistical physics, we introduce the concept of entropy into neural networks by reconceptualizing them as hypothetical one-dimensional physical systems in which each parameter is the coordinate of a "particle". We investigate the correlation between the entropy of these physical systems and the generalizability of the corresponding neural networks on four distinct machine learning tasks. Our results suggest an entropy advantage: high-entropy states consistently outperform the states reached by conventional training optimizers such as stochastic gradient descent. We also perform separation-of-variable studies to evaluate the factors controlling the entropy advantage.
Presenters
-
Entao Yang
Air Liquide USA, Air Liquide
Authors
-
Entao Yang
Air Liquide USA, Air Liquide
-
Xiaotian Zhang
City University of Hong Kong
-
Yue Shang
University of Pennsylvania
-
Ge Zhang
City University of Hong Kong