Asymptotic stability of the neural network and its generalization power
ORAL
Abstract
The generalization power of neural networks is of great importance to both the theoretical and practical development of neural network models. Through the theoretical framework of dynamical stability analysis, we find that a fully connected neural network’s generalization power is associated with its asymptotic stability, i.e., a network whose asymptotic fixed points are more unstable has lower generalization power due to the emergence of chaotic behavior. We further find that the neural network’s training is a random-walk-like diffusion process toward chaos in the parameter space, and that the regularization technique of weight decay effectively reverses it. Specifically, regularization is effective only insofar as it pulls the model out of the unstable phase; once the model is in the stable phase, test losses are similar regardless of how large the regularization strength is. Therefore, a model at the boundary of stability achieves a balance between underfitting and overfitting. Based on this, we propose a method to calculate a lower bound on the regularization strength that maintains the model at the boundary of stability. Lastly, an analogy with spin glasses explains why the training process deviates from random-walk behavior after it enters the chaotic phase.
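
The stability notion invoked in the abstract can be illustrated numerically. Below is a minimal sketch in Python (not the authors' code): it treats a wide fully connected layer as the iterated map h -> tanh(W h + b) and estimates a Lyapunov-type exponent from the growth rate of a small perturbation around the asymptotic trajectory. The width n, weight scale sigma_w, and the helper lyapunov_estimate are illustrative assumptions; a positive exponent corresponds to the unstable/chaotic phase described above, a negative one to the stable phase.

# Minimal sketch (illustrative, not the authors' method): probe the asymptotic
# stability of the iterated fully connected map h -> tanh(W h + b) by tracking
# how an infinitesimal perturbation grows or shrinks along the trajectory.
import numpy as np

def lyapunov_estimate(W, b, steps=2000, transient=500, eps=1e-8, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    n = W.shape[0]
    h = rng.standard_normal(n)
    for _ in range(transient):            # relax toward the asymptotic regime
        h = np.tanh(W @ h + b)
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    h_pert = h + eps * u                  # nearby trajectory, distance eps
    log_growth = 0.0
    for _ in range(steps):
        h = np.tanh(W @ h + b)
        h_pert = np.tanh(W @ h_pert + b)
        d = np.linalg.norm(h_pert - h)
        log_growth += np.log(d / eps)     # per-step expansion/contraction
        h_pert = h + eps * (h_pert - h) / d   # renormalize: keep perturbation infinitesimal
    return log_growth / steps

if __name__ == "__main__":
    n, sigma_w = 512, 1.5                 # illustrative width and weight scale
    rng = np.random.default_rng(0)
    W = rng.standard_normal((n, n)) * sigma_w / np.sqrt(n)
    b = np.zeros(n)
    lam = lyapunov_estimate(W, b, rng=rng)
    print(f"estimated exponent: {lam:.3f}  ({'chaotic' if lam > 0 else 'stable'})")
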
Presenters
-
Lin Zhang
Department of Physics, National University of Singapore
Authors
-
Lin Zhang
Department of Physics, National University of Singapore
-
Ling Feng
Institute of High Performance Computing, A*STAR, Singapore
-
Kan Chen
Risk Management Institute, National University of Singapore
-
Choy Heng Lai
Department of Physics, National University of Singapore