Results from a Mapping Between Reinforcement Learning and Non-Equilibrium Statistical Mechanics

ORAL

Abstract

Reinforcement learning (RL), a field of machine learning used to solve sequential decision-making problems, has recently become a popular tool for obtaining solutions to a variety of complex problems in physics. Despite this success, there has been limited work on the relationship between the theoretical frameworks of RL and statistical mechanics. Our recent work has established a mapping between average-reward entropy-regularized RL and non-equilibrium statistical mechanics (NESM) using large deviation theory. We highlight how this mapping allows one to approach problems in NESM from an RL perspective and vice versa. As an example, we discuss how results from RL research on "reward shaping" can be extended using the framework of statistical mechanics of trajectories. In this setting, we derive results in RL that are analogous to the Gibbs-Bogoliubov inequality in equilibrium statistical mechanics, and we propose methods to iteratively improve this bound based on results from RL. The mapping established in our work can thus lead to new results and algorithms in both RL and NESM.
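For context, the classical Gibbs-Bogoliubov inequality mentioned above bounds the free energy F of a system with Hamiltonian H in terms of any reference system with Hamiltonian H_0 and free energy F_0. This is the standard textbook statement, included here only for illustration; the precise RL analog is derived in the papers cited below:

    F \le F_0 + \langle H - H_0 \rangle_0 ,

where \langle \cdot \rangle_0 denotes an average over the equilibrium ensemble of the reference system. Any tractable reference system thus yields a variational upper bound on the true free energy, which parallels the abstract's use of prior solutions to obtain and iteratively improve bounds in the RL setting.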

Publications

  • "Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning": J. Adamczyk, A. Arriojas, S. Tiomkin, R. V. Kulkarni; under review at AAAI-23

  • "Analytical framework for maximum entropy reinforcement learning using large deviation theory": A. Arriojas, J. Adamczyk, S. Tiomkin, R. V. Kulkarni; under review at Physical Review Research

Presenters

  • Jacob Adamczyk

    University of Massachusetts Boston

Authors

  • Jacob Adamczyk

    University of Massachusetts Boston

  • Argenis Arriojas Maldonado

    University of Massachusetts Boston

  • Stas Tiomkin

    San Jose State University

  • Rahul V Kulkarni

    University of Massachusetts Boston