Recurrent neural networks balance sensory- and memory-guided policies for spatial foraging.
ORAL
Abstract
Animal foraging relies on spatial navigation under uncertainty; searchers can use sensory information to explore for new rewarding patches or use spatial memory to exploit previously rewarded patches. How do neural systems balance these search strategies in the face of environmental uncertainty? Deep reinforcement learning (RL) combines the powerful function approximation of neural networks with traditional reinforcement learning to find optimal policies in complex environments and tasks. We train recurrent actor-critic networks using meta-RL to solve a 2D spatial foraging task. We show that training constructs population dynamics that carry out within-trial inference, implementing adaptive strategies that maximize reward. These dynamics support switching, within a single episode, between sensory-guided exploration and memory-guided exploitation strategies. Additionally, agents learn efficient strategies for navigating non-stationary reward distributions. These representations correlate with belief-state inference, which emerges from a model-free learning algorithm. The representational geometry of these solutions provides hypotheses for the dynamics underlying adaptive, continuous spatial decision-making strategies in neural systems.
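The abstract gives no implementation details; as a rough illustration of the kind of architecture described, the sketch below shows a recurrent actor-critic network in PyTorch. The GRU core, layer sizes, and all names are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch (assumed, not the authors' code) of a recurrent actor-critic
# agent of the kind used in meta-RL: a GRU core carries memory across the
# timesteps of an episode, and separate linear heads output a policy over
# actions (actor) and a state-value estimate (critic).
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden_dim: int = 128):
        super().__init__()
        self.core = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.policy_head = nn.Linear(hidden_dim, n_actions)  # actor
        self.value_head = nn.Linear(hidden_dim, 1)            # critic

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim); hidden carries memory across calls
        features, hidden = self.core(obs_seq, hidden)
        action_logits = self.policy_head(features)
        state_values = self.value_head(features).squeeze(-1)
        return action_logits, state_values, hidden

# Example: one forward pass over a batch of observation sequences.
if __name__ == "__main__":
    agent = RecurrentActorCritic(obs_dim=8, n_actions=4)
    obs = torch.randn(2, 10, 8)  # 2 episodes, 10 timesteps, 8-dim observations
    logits, values, h = agent(obs)
    print(logits.shape, values.shape)  # (2, 10, 4) and (2, 10)
```

In a meta-RL setup, the recurrent state would typically persist across trials within an episode so that within-trial inference can emerge from the dynamics; the losses and environment interface are omitted here.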
Presenters
- Scott Sterrett, University of Washington

Authors
- Scott Sterrett, University of Washington
- David H Gire, University of Washington
- Adrienne Fairhall, University of Washington