Physics-Inspired Model Compression of Neural Networks
ORAL
Abstract
Model compression is a subfield of machine learning concerned with methods for reducing a model's size while minimizing negative effects on its performance. We introduce a new method, inspired by statistical physics, for compressing neural networks: in particular, we treat the parameters of a network during training as a system of particles subjected both to gradients of the objective function and to pairwise attractive interactions, and we show how this treatment causes the parameter distribution of the trained network to concentrate around a discrete set of values. We draw explicit connections between this method and quantization, a popular form of model compression in which the number of bits required to represent each parameter in memory is reduced. We demonstrate that this method produces high-performance, memory-efficient networks across a range of models and tasks. We analyze the parameter distributions that result from the application of our method, and comment on surprising structural features of these distributions. We suggest our method is a powerful tool for unraveling the complexity of overparameterized networks.
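As a rough illustration of the idea described above, the sketch below adds a pairwise attractive "interaction energy" between parameter values to an ordinary training loss, so that parameters feel both the task gradient and a force pulling nearby values together, encouraging the trained weights to cluster around a discrete set of values amenable to quantization. The Gaussian interaction kernel, its width `sigma`, the coupling strength `lam`, and the random subsampling of parameter pairs are all assumptions for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pairwise_attraction_energy(params, sigma=0.05, n_samples=1024):
    """Attractive interaction energy between randomly sampled parameter values.

    Each pair (w_i, w_j) contributes -exp(-(w_i - w_j)^2 / (2 sigma^2)), so
    minimizing this energy pulls parameter values toward one another and
    encourages clustering around a few shared values. (Hypothetical kernel;
    the paper's interaction may differ.)
    """
    flat = torch.cat([p.reshape(-1) for p in params])
    idx = torch.randint(0, flat.numel(), (n_samples,), device=flat.device)
    w = flat[idx]
    diff = w.unsqueeze(0) - w.unsqueeze(1)            # (n_samples, n_samples)
    return -torch.exp(-diff.pow(2) / (2 * sigma**2)).mean()

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
lam = 1e-2  # coupling strength between task loss and interaction energy (assumed)

def training_step(x, y):
    optimizer.zero_grad()
    task_loss = F.cross_entropy(model(x), y)
    interaction = pairwise_attraction_energy(model.parameters())
    loss = task_loss + lam * interaction
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch to show the call pattern.
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
print(training_step(x, y))
```

After training with such a term, the empirical distribution of weights tends toward a small number of modes, which can then be mapped to a low-bit codebook; the abstract's connection to quantization corresponds to this final discretization step.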
Presenters
- Daniel T Bernstein, Princeton University
Authors
- Daniel T Bernstein, Princeton University
- David J Schwab, The Graduate Center, CUNY