Results from a Mapping Between Reinforcement Learning and Non-Equilibrium Statistical Mechanics

ORAL

Abstract

Reinforcement learning (RL), a field of machine learning used to solve sequential decision-making problems, has recently become a popular tool for obtaining solutions to a variety of complex problems in physics. Despite this success, there has been limited work on the relationship between the theoretical frameworks of RL and statistical mechanics. Our recent work has established a mapping between average-reward entropy-regularized RL and non-equilibrium statistical mechanics (NESM) using large deviation theory. We highlight how this mapping allows one to approach problems in NESM from an RL perspective and vice versa. As an example, we discuss how results from RL research on "reward shaping" can be extended using the framework of statistical mechanics of trajectories. In this setting, we derive results in RL that are analogous to the Gibbs-Bogoliubov inequality in equilibrium statistical mechanics, and we propose methods to iteratively improve this bound based on results from RL. The mapping established in our work can thus lead to new results and algorithms in both RL and NESM.
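For context, the classical Gibbs-Bogoliubov inequality mentioned above bounds the free energy F of a system with Hamiltonian H in terms of any reference system with Hamiltonian H_0 and free energy F_0. This is the standard textbook statement, included here only for illustration; the precise RL analog is derived in the papers cited below:

    F \le F_0 + \langle H - H_0 \rangle_0 ,

where \langle \cdot \rangle_0 denotes an average over the equilibrium ensemble of the reference system. Any tractable reference system thus yields a variational upper bound on the true free energy, which parallels the abstract's use of prior solutions to obtain and iteratively improve bounds in the RL setting.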

Publications

  • "Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning": J. Adamczyk, A. Arriojas, S. Tiomkin, R. V. Kulkarni; under review at AAAI-23

  • "Analytical framework for maximum entropy reinforcement learning using large deviation theory": A. Arriojas, J. Adamczyk, S. Tiomkin, R. V. Kulkarni; under review at Physical Review Research

Presenters

  • Jacob Adamczyk

    University of Massachusetts Boston

Authors

  • Jacob Adamczyk

    University of Massachusetts Boston

  • Argenis Arriojas Maldonado

    University of Massachusetts Boston

  • Stas Tiomkin

    San Jose State University

  • Rahul V Kulkarni

    University of Massachusetts Boston