Publication: Stochastic Resetting of Reinforcement Learning Agents
Abstract
Stochastic resetting, the strategy of randomly restarting a search process, has been shown to optimize first-passage times across a wide range of physical and biological systems. In this thesis, we apply stochastic resetting to reinforcement learning (RL) agents with the aim of understanding its effects on exploration efficiency and learning dynamics. Beginning with a review of stochastic resetting in simple diffusive and random-walk systems, we extend these ideas to ε-greedy Q-learning agents operating in a bounded two-dimensional grid environment. Through numerical simulations, we find that although stochastic resetting does not minimize first-passage times in our simulation geometry, it can still significantly accelerate learning by reducing the number of training steps required to reach an optimal policy. We identify characteristic signatures of the learning dynamics, such as a sharp spike in the relative variance of episode lengths and a universal intersection point among training curves obtained at a fixed exploration rate but different resetting rates. Moreover, we demonstrate that even small nonzero resetting rates enhance learning efficiency compared to no resetting at all. These findings suggest that stochastic resetting may be a broadly applicable tool for accelerating learning processes in both artificial and biological systems, and they point to avenues for further numerical and analytical investigation.
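To make the setup concrete, the following is a minimal sketch of per-step stochastic resetting grafted onto a tabular ε-greedy Q-learning loop on a bounded grid. It is an illustration only: the grid size, reward structure, reset-to-start rule, and all parameter values (reset_rate, alpha, gamma, eps) are assumptions for the sketch and do not reflect the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 10                                         # grid is N x N, states (x, y)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # left, right, down, up
START, GOAL = (0, 0), (N - 1, N - 1)           # assumed start and goal states

alpha, gamma, eps = 0.1, 0.99, 0.1             # learning rate, discount, exploration
reset_rate = 0.01                              # per-step probability of resetting to START

Q = np.zeros((N, N, len(ACTIONS)))             # tabular action-value function

def step(state, a):
    """Move within the bounded grid; hitting a wall leaves the agent in place."""
    dx, dy = ACTIONS[a]
    x = min(max(state[0] + dx, 0), N - 1)
    y = min(max(state[1] + dy, 0), N - 1)
    nxt = (x, y)
    reward = 1.0 if nxt == GOAL else 0.0       # sparse reward at the goal (assumed)
    return nxt, reward, nxt == GOAL

for episode in range(200):
    s = START
    for t in range(10_000):                    # cap episode length for the sketch
        # Stochastic resetting: with probability reset_rate per step, restart
        # the search from the initial state without ending the episode.
        if rng.random() < reset_rate:
            s = START
        # epsilon-greedy action selection
        a = (int(rng.integers(len(ACTIONS))) if rng.random() < eps
             else int(np.argmax(Q[s[0], s[1]])))
        s2, r, done = step(s, a)
        # standard Q-learning update
        Q[s[0], s[1], a] += alpha * (r + gamma * np.max(Q[s2[0], s2[1]]) - Q[s[0], s[1], a])
        s = s2
        if done:
            break
```

In this sketch, a resetting event teleports the agent to its start state while the Q-table and episode continue uninterrupted, which is one natural way to realize Poissonian resetting in discrete time; tracking episode lengths across training would then expose quantities like the relative-variance spike mentioned in the abstract.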