
Recurrent neural networks balance sensory- and memory-guided policies for spatial foraging.

ORAL

Abstract

Animal foraging relies on spatial navigation under uncertainty; searchers can use sensory information to explore for new rewarding patches or use spatial memory to exploit previously rewarding patches. How do neural systems balance these search strategies in the face of environmental uncertainty? Deep reinforcement learning (RL) combines the powerful function approximation of neural networks with traditional reinforcement learning to find optimal policies in complex environments and tasks. We train recurrent actor-critic networks using meta-RL to solve a 2D spatial foraging task. We show that network training constructs population dynamics that carry out within-trial inference to implement adaptive, reward-maximizing strategies. These population dynamics support single-episode switching between sensory-guided exploration and memory-guided exploitation. Additionally, agents learn efficient strategies for navigating non-stationary reward distributions. These representations correlate with belief-state inference, which emerges from a model-free learning algorithm. The representational geometry of these solutions provides hypotheses for the dynamics underlying adaptive, continuous spatial decision-making strategies in neural systems.
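
To make the described setup concrete, below is a minimal sketch (in Python with PyTorch) of a recurrent actor-critic agent acting in a toy 2D foraging environment with a hidden, episode-resampled reward patch, so that the recurrent state must support within-episode inference. This is not the authors' implementation: the GRU core, the environment, the reward scheme, and all hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    """GRU core with separate policy (actor) and value (critic) heads."""
    def __init__(self, obs_dim, n_actions, hidden_dim=64):
        super().__init__()
        self.core = nn.GRUCell(obs_dim, hidden_dim)
        self.policy = nn.Linear(hidden_dim, n_actions)  # action logits
        self.value = nn.Linear(hidden_dim, 1)            # state-value estimate

    def forward(self, obs, h):
        h = self.core(obs, h)
        return self.policy(h), self.value(h), h

class ForagingEnv:
    """Toy 2D grid: the reward patch is resampled each episode (hypothetical)."""
    def __init__(self, size=5):
        self.size = size

    def reset(self):
        self.pos = torch.randint(0, self.size, (2,)).float()
        self.patch = torch.randint(0, self.size, (2,)).float()
        return self._obs()

    def step(self, action):  # 0..3 = up/down/left/right
        moves = torch.tensor([[0, 1], [0, -1], [-1, 0], [1, 0]], dtype=torch.float)
        self.pos = (self.pos + moves[action]).clamp(0, self.size - 1)
        dist = (self.pos - self.patch).abs().sum()
        reward = 1.0 if dist == 0 else -0.01  # reward at the patch, small step cost
        return self._obs(), reward

    def _obs(self):
        # Noisy sensory cue that decays with distance to the hidden patch.
        dist = (self.pos - self.patch).abs().sum()
        cue = torch.exp(-dist / self.size) + 0.05 * torch.randn(1)
        return torch.cat([self.pos / self.size, cue.view(1)])

def run_episode(env, net, optimizer, steps=50, gamma=0.99):
    """One-episode advantage actor-critic update; the hidden state h carries
    memory across steps within the episode (the meta-RL 'inner loop')."""
    obs, h = env.reset(), torch.zeros(1, net.core.hidden_size)
    logps, values, rewards = [], [], []
    for _ in range(steps):
        logits, value, h = net(obs.unsqueeze(0), h)
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        obs, reward = env.step(action.item())
        logps.append(dist.log_prob(action)); values.append(value); rewards.append(reward)
    # Discounted returns, computed backward through the episode.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    values = torch.cat(values).squeeze(-1)
    adv = returns - values.detach()
    loss = -(torch.stack(logps).squeeze(-1) * adv).sum() + (returns - values).pow(2).sum()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return sum(rewards)

if __name__ == "__main__":
    env, net = ForagingEnv(), RecurrentActorCritic(obs_dim=3, n_actions=4)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for ep in range(200):
        run_episode(env, net, opt)

Training across many episodes with resampled patch locations is the "outer loop" of meta-RL; whether the trained network then balances cue-driven exploration against memory-driven return visits is the kind of question the abstract addresses with population-dynamics analyses.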

Presenters

  • Scott Sterrett

    University of Washington

Authors

  • Scott Sterrett

    University of Washington

  • David H Gire

    University of Washington

  • Adrienne Fairhall

    University of Washington