Results from a Mapping Between Reinforcement Learning and Non-Equilibrium Statistical Mechanics
ORAL
Abstract
Reinforcement learning (RL), a field of machine learning used to solve sequential decision-making problems, has recently become a popular tool for obtaining solutions to a variety of complex problems in physics. Despite this success as a tool, there has been limited work focusing on the relationship between the theoretical frameworks of RL and statistical mechanics. Our recent work has established a mapping between average-reward entropy-regularized RL and non-equilibrium statistical mechanics (NESM) using large deviation theory. We highlight how this mapping allows one to approach problems in NESM from an RL perspective and vice versa. As an example, we discuss how results from RL research on "reward shaping" can be extended using the framework of the statistical mechanics of trajectories. In this setting, we derive results in RL that are analogous to the Gibbs-Bogoliubov inequality in equilibrium statistical mechanics, and we propose methods, based on results from RL, to iteratively improve the resulting bound. The mapping established in our work can thus lead to new results and algorithms in both RL and NESM.
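For reference, the Gibbs-Bogoliubov inequality mentioned above is the classical variational bound on the free energy in equilibrium statistical mechanics: for a system with Hamiltonian $H$ and a reference system with Hamiltonian $H_0$,

\[
F \le F_0 + \langle H - H_0 \rangle_0 ,
\]

where $F$ and $F_0$ are the corresponding free energies and $\langle \cdot \rangle_0$ denotes an average over the equilibrium (Boltzmann) distribution of the reference system. The analogous RL bound and its iterative improvement are developed in the publications listed below.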
–
Publication: "Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning": J. Adamczyk, A. Arriojas, S. Tiomkin, R. V. Kulkarni; under review at AAAI-23<br><br>"Analytical framework for maximum entropy reinforcement learning using large deviation theory": A. Arriojas, J. Adamczyk, S. Tiomkin, R. V. Kulkarni; under review at Physical Review Research
Presenters
-
Jacob Adamczyk
University of Massachusetts Boston
Authors
-
Jacob Adamczyk
University of Massachusetts Boston
-
Argenis Arriojas Maldonado
University of Massachusetts Boston
-
Stas Tiomkin
San Jose State University
-
Rahul V Kulkarni
University of Massachusetts Boston