Deep Dive into Model-free Reinforcement Learning for Biological and Robotic Systems: Theory and Practice
The paper addresses the challenge of applying model-free reinforcement learning, specifically actor-critic methods, to understand and design feedback control in biological and robotic systems. It provides a concise mathematical and algorithmic exposition and reports no specific numerical results.
It offers researchers in biology and robotics a framework for deriving sensorimotor strategies and design rules. The contribution is incremental: it builds on existing methods and does not introduce new paradigms.
Animals and robots exist in a physical world and must coordinate their bodies to achieve behavioral objectives. With recent developments in deep reinforcement learning, it is now possible for scientists and engineers to obtain sensorimotor strategies (policies) for specific tasks using physically simulated bodies and environments. However, the utility of these methods goes beyond the constraints of a specific task; they offer an exciting framework for understanding the organization of an animal sensorimotor system in connection to its morphology and physical interaction with the environment, as well as for deriving general design rules for sensing and actuation in robotic systems. Algorithms and code implementing both learning agents and environments are increasingly available, but the basic assumptions and choices that go into the formulation of an embodied feedback control problem using deep reinforcement learning may not be immediately apparent. Here, we present a concise exposition of the mathematical and algorithmic aspects of model-free reinforcement learning, specifically through the use of \textit{actor-critic} methods, as a tool for investigating the feedback control underlying animal and robotic behavior.
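To make the term actor-critic concrete, the following is a minimal sketch of a one-step tabular actor-critic update, not the formulation used in the paper. Everything in it is an illustrative assumption: the five-cell corridor environment, the reward of 1 at the goal, the softmax policy over per-state action preferences, the learning rates, and all function and variable names. The critic learns a state-value estimate from the temporal-difference (TD) error, and the actor uses that same TD error to adjust its action preferences. The discount-power factor that appears in some textbook versions of the actor update is omitted for simplicity.

```python
import math
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the terminal goal
ACTIONS = (-1, +1)    # action index 0 = step left, index 1 = step right


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action_index):
    """Toy environment: move along the corridor, reward 1.0 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train_actor_critic(episodes=2000, gamma=0.95, actor_lr=0.1,
                       critic_lr=0.1, max_steps=200, seed=0):
    """One-step actor-critic with a tabular softmax actor and tabular critic."""
    rng = random.Random(seed)
    prefs = [[0.0, 0.0] for _ in range(N_STATES)]   # actor parameters
    values = [0.0] * N_STATES                       # critic estimates V(s)

    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            probs = softmax(prefs[state])
            action = 0 if rng.random() < probs[0] else 1
            next_state, reward, done = step(state, action)

            # Critic: TD error, bootstrapping from V(s') unless s' is terminal.
            bootstrap = 0.0 if done else values[next_state]
            td_error = reward + gamma * bootstrap - values[state]
            values[state] += critic_lr * td_error

            # Actor: policy-gradient step weighted by the TD error.
            # For a softmax policy, d log pi(action|s) / d pref[a]
            # equals 1[a == action] - pi(a|s).
            for a in range(len(ACTIONS)):
                grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
                prefs[state][a] += actor_lr * td_error * grad_log_pi

            if done:
                break
            state = next_state

    policy = [softmax(p) for p in prefs]
    return policy, values
```

Calling `train_actor_critic()` returns the learned action probabilities for each cell together with the critic's value estimates. In this toy setting the critic plays the role of a learned baseline: the actor never sees the reward directly, only the TD error computed from the critic.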