Exploring the Relationship Between SGD Noise, Hessian Structure, and Neuron Functionality in Artificial Neural Networks
ORAL
Abstract
Artificial neural networks (ANNs) exhibit remarkable generalizability. During training, some neurons undergo significant transformations and acquire distinct functional roles. Training is typically driven by the prevalent optimizer, stochastic gradient descent (SGD). Empirical studies suggest that the noise structure inherent to SGD strongly correlates with the local Hessian of the loss landscape, a relationship that plays a critical role in finding solutions that generalize well. Beyond guiding the optimization process, this relationship also interacts with other intrinsic properties of the network: the number of relatively sharp eigendirections of the Hessian, as well as the number of activated neurons, increases with the complexity of the data. Moreover, the permutation symmetry of neurons within each hidden layer allows for multiple equivalent configurations of the network parameters. As training progresses, this symmetry permits further refinement in how neurons align their functional roles. Consequently, the maximum cosine similarity after permutation between the weight vectors of any pair of parallelly trained ANNs is enhanced, indicating improved structural alignment after training. In this talk, we will discuss 1) how the noise covariance relates to the Hessian through its dependence on the Hessians of the individual sample losses, and then 2) the mechanism behind the enhancement of the maximum cosine similarity and its relation to the architecture of the network and the complexity of the data.
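For orientation only (the talk's own derivation may differ), a textbook expression for the minibatch SGD gradient-noise covariance, written with per-sample losses \(\ell_i\), full loss \(L = \frac{1}{N}\sum_i \ell_i\), and batch size \(B\), is
\[
\Sigma(\theta) \;\approx\; \frac{1}{B}\left[\frac{1}{N}\sum_{i=1}^{N} \nabla\ell_i(\theta)\,\nabla\ell_i(\theta)^{\top} \;-\; \nabla L(\theta)\,\nabla L(\theta)^{\top}\right],
\]
where, near a minimum and for common losses, the first (Fisher-like) term is close to the Hessian of \(L\); this is one standard route by which the noise covariance and the local Hessian become aligned.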
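To make point 2 concrete, here is a minimal sketch of how the maximum cosine similarity after neuron permutation can be computed for one hidden layer, using the Hungarian algorithm. The weight matrices W_a and W_b and their shapes are hypothetical placeholders for illustration, not the authors' actual code or setup.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def max_cos_similarity_after_permutation(W_a, W_b):
    """W_a, W_b: (n_hidden, n_in) incoming weight matrices of the same hidden
    layer from two parallelly trained networks. Returns the per-neuron cosine
    similarities under the best neuron permutation, and that permutation."""
    # Normalize each neuron's incoming weight vector.
    A = W_a / np.linalg.norm(W_a, axis=1, keepdims=True)
    B = W_b / np.linalg.norm(W_b, axis=1, keepdims=True)
    sim = A @ B.T  # pairwise cosine similarities between neurons
    # Hungarian algorithm: permutation of network B's neurons that maximizes
    # the total cosine similarity with network A's neurons.
    row, col = linear_sum_assignment(sim, maximize=True)
    return sim[row, col], col

# Example with random weights (real use would load trained parameters):
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 20)), rng.normal(size=(8, 20))
sims, perm = max_cos_similarity_after_permutation(W1, W2)
print(sims.mean())
```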
Presenters
-
Yikuan Zhang
Peking University
Authors
-
Yikuan Zhang
Peking University
-
Ning Yang
Peking University
-
Qi Ouyang
Peking University
-
Yuhai Tu
IBM Thomas J. Watson Research Center