A theoretical eigenanalysis framework for neural autoregressive models of multi-scale chaotic dynamics
ORAL
Abstract
Recent years have seen a growing popularity of neural autoregressive models in science and engineering, especially for multi-scale chaotic dynamical systems such as turbulent flows, and in weather, climate, and ocean modeling.
Usually, these autoregressive models are represented as deep neural networks which are trained to predict the state of a system at the next time step, given its current state.
Although very successful at short-term prediction, such models often become unstable at longer time scales.
Depending on the architecture, the loss function, and other design choices, a model may exhibit a longer or shorter stability horizon, or even turn out to be unstable altogether.
While recent work has explored ideas inspired by numerical methods to improve the stability of long-term predictions (e.g., hard-constraining the architecture with higher-order integrators, or using physics-inspired loss functions), an a priori diagnostic of a model's quality, in terms of both performance and stability, and more generally a rigorous or semi-empirical theory of the inference-time stability of these models, is still missing.
In this work, we use linear stability analysis from classical numerical methods to demonstrate and analyze a semi-empirical theory of stability for neural autoregressive models that is agnostic to the architecture, the integration scheme that hard-constrains the model, and the loss function.
Using our semi-empirical theory, we propose a novel stability-promoting loss function that improves both performance and stability of neural autoregressive models of dynamical systems.
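To make the underlying idea concrete, the sketch below illustrates (in PyTorch) how a linear-stability-based penalty could be attached to a one-step prediction loss: the learned one-step map is linearized around a state, and eigenvalues of the Jacobian whose magnitude exceeds one, i.e., violations of the classical linear stability criterion, are penalized. This is a minimal illustration of the general concept, not the authors' actual formulation; the function and parameter names (spectral_radius_penalty, stability_promoting_loss, margin, lam) are hypothetical.

```python
import torch

def spectral_radius_penalty(model, x, margin=1.0):
    """Penalize eigenvalues of the linearized one-step map that leave the unit circle.

    Classical linear stability of an autoregressive map x_{t+1} = f(x_t) requires the
    eigenvalues of the Jacobian df/dx to have magnitude <= 1 near the attractor.
    (Illustrative sketch; assumes a flat state vector x of dimension d.)
    """
    # Jacobian of the learned one-step map at x; create_graph=True so the
    # penalty is differentiable with respect to the model parameters.
    jac = torch.autograd.functional.jacobian(model, x, create_graph=True)  # (d, d)
    eigvals = torch.linalg.eigvals(jac)      # complex eigenvalues of the linearization
    rho = eigvals.abs().max()                # spectral radius
    return torch.relu(rho - margin)          # zero penalty if already in the stable region

def stability_promoting_loss(model, x_t, x_tp1, lam=0.1):
    """One-step MSE plus a weighted spectral-radius penalty (illustrative only)."""
    pred = model(x_t)
    return torch.mean((pred - x_tp1) ** 2) + lam * spectral_radius_penalty(model, x_t)
```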
Presenters
-
Ashesh K Chattopadhyay
University of California, Santa Cruz
Authors
-
Ashesh K Chattopadhyay
University of California, Santa Cruz
-
Conrad S Ainslie
University of California, Santa Cruz
-
Pedram Hassanzadeh
University of Chicago
-
Michael Mahoney
UC Berkeley