A framework for solving time-delayed Markov Decision Processes
ORAL
Abstract
Reinforcement learning has revolutionized our understanding of evolved systems and our ability to engineer systems by providing a theoretical framework for maximizing expected reward. However, the time delay between observation and action is estimated to be roughly 150 ms for humans, and such delays should affect reinforcement learning algorithms. We reformulate the Markov Decision Process framework to include time delays in action, first deriving a new Bellman equation in a way that unifies previous attempts and then implementing the corresponding SARSA-like algorithm. The main ramification, potentially useful for both evolved and engineered systems, is that when the state space is smaller than the action space, the modified reinforcement learning algorithms prefer to operate on sequences of states rather than just the present state, with the sequence length equal to one plus the time delay.
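To make the idea concrete, here is a minimal tabular sketch of a SARSA-like update for an MDP whose actions take effect `delay` steps after they are chosen, with the value function indexed by the tuple of the last (delay + 1) observed states, as the abstract suggests. The environment interface (env.reset, env.step returning state, reward, done), the epsilon-greedy policy, and the padding of the action queue with a default action 0 are illustrative assumptions, not the authors' exact formulation.

    from collections import defaultdict, deque
    import random

    def delayed_sarsa(env, n_actions, delay, episodes=500,
                      alpha=0.1, gamma=0.99, epsilon=0.1):
        """Tabular SARSA where Q is indexed by the last (delay + 1) states."""
        Q = defaultdict(float)  # Q[(state_window, action)] -> value estimate

        def policy(window):
            # Epsilon-greedy over the augmented state (a tuple of recent states).
            if random.random() < epsilon:
                return random.randrange(n_actions)
            return max(range(n_actions), key=lambda a: Q[(window, a)])

        for _ in range(episodes):
            s = env.reset()
            # Augmented state: the last (delay + 1) observations, padded at the start.
            window = deque([s] * (delay + 1), maxlen=delay + 1)
            # Actions chosen but not yet executed; padded with a default action 0.
            pending = deque([0] * delay)

            w = tuple(window)
            a = policy(w)
            done = False
            while not done:
                pending.append(a)              # queue the newly chosen action
                executed = pending.popleft()   # the action chosen `delay` steps ago runs now
                s_next, r, done = env.step(executed)

                window.append(s_next)
                w_next = tuple(window)
                a_next = policy(w_next)

                # SARSA-style bootstrapped update on the augmented representation.
                target = r + (0.0 if done else gamma * Q[(w_next, a_next)])
                Q[(w, a)] += alpha * (target - Q[(w, a)])

                w, a = w_next, a_next
        return Q

With delay = 0 this reduces to ordinary tabular SARSA on the present state; with delay > 0 the table grows with the number of distinct state windows, which is why the trade-off described in the abstract depends on the relative sizes of the state and action spaces.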
Presenters
-
Sarah Marzen
Scripps, Pitzer & CMC
Authors
-
Sarah Marzen
Scripps, Pitzer & CMC
-
Yorgo Sawaya
Temple University
-
George Issa
U.C. Davis