Emergence of Adaptive Circadian Rhythms in Deep Reinforcement Learning
This work addresses how artificial agents can adapt to environmental regularities, offering insights for robotics and AI, though its contribution is incremental in that it applies known methods to a new domain.
The study investigated whether deep reinforcement learning agents can develop circadian-like rhythms in a foraging task with periodic environmental variations, and found that agents endogenously and adaptively internalized the rhythm, adjusting to phase shifts without retraining.
Adapting to regularities of the environment is critical for biological organisms to anticipate events and plan. A prominent example is the circadian rhythm, which reflects organisms' internalization of the $24$-hour period of the Earth's rotation. In this work, we study the emergence of circadian-like rhythms in deep reinforcement learning agents. In particular, we deploy agents in an environment with a reliable periodic variation, in which they solve a foraging task. We systematically characterize the agents' behavior during learning and demonstrate the emergence of a rhythm that is endogenous and entrainable. Interestingly, the internal rhythm adapts to shifts in the phase of the environmental signal without any retraining. Furthermore, we show via bifurcation and phase response curve analyses how artificial neurons develop dynamics that support the internalization of the environmental rhythm. From a dynamical systems view, we demonstrate that the adaptation proceeds through the emergence of a stable periodic orbit in the neuron dynamics, with a phase response that allows an optimal phase synchronization between the agent's dynamics and the environmental rhythm.
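To make the notion of a phase response curve concrete, the following is a minimal sketch for a textbook limit-cycle oscillator (the radial isochron clock), not for the trained agent's network. The oscillator choice, the function name `phase_response_curve`, and the kick size are illustrative assumptions and do not come from this work. For this oscillator the stable periodic orbit is the unit circle and the asymptotic phase of any state is its polar angle, so the phase shift caused by an instantaneous kick can be computed in closed form.

```python
import numpy as np

def phase_response_curve(phases, kick=0.3):
    """Phase shift of a radial isochron clock after a kick along the x-axis.

    Illustrative assumption: the limit cycle is the unit circle and the
    asymptotic phase equals the polar angle (valid for |kick| < 1).
    """
    # State on the limit cycle at each phase, displaced by the kick.
    x = np.cos(phases) + kick
    y = np.sin(phases)
    # The phase after the kick is the polar angle of the displaced state.
    new_phases = np.arctan2(y, x)
    # Wrap the shift into [-pi, pi): negative is a delay, positive an advance.
    shift = new_phases - phases
    return (shift + np.pi) % (2 * np.pi) - np.pi

# Sample the curve at eight evenly spaced phases of the cycle.
phases = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
prc = phase_response_curve(phases)
for phi, dphi in zip(phases, prc):
    print(f"phase {phi:5.2f} rad -> shift {dphi:+.3f} rad")
```

In this sketch the curve contains both a delay region and an advance region. That is the general property that lets an oscillator move its phase toward an external periodic signal, whichever direction the mismatch lies in.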