Physical Reinforcement Learning
This work addresses the challenge of building energy-efficient, robust autonomous agents for uncertain environments. The contribution is incremental: it adapts existing methods to a new type of network.
The paper adapts analog Contrastive Local Learning Networks (CLLNs), which are inherently low-power and robust to damage, to reinforcement learning (RL). It demonstrates success on two simple RL tasks using Q-learning adapted for simulated CLLNs.
Digital computers are power-hungry and largely intolerant of damaged components, making them potentially difficult tools for energy-limited autonomous agents in uncertain environments. Recently developed Contrastive Local Learning Networks (CLLNs), analog networks of self-adjusting nonlinear resistors, are inherently low-power and robust to physical damage, but were constructed to perform supervised learning. In this work, we demonstrate success on two simple RL problems using Q-learning adapted for simulated CLLNs. Doing so makes explicit the components (beyond the network being trained) required to enact various tools in the RL toolbox, some of which (policy function and value function) are more natural in this system than others (replay buffer). We discuss assumptions, such as physical safety, that digital hardware requires, that CLLNs can forgo, and that biological systems cannot rely on. We also highlight secondary goals that are important in biology and trainable in CLLNs, but make little sense in digital computers.
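For readers unfamiliar with Q-learning, the sketch below shows the ordinary tabular update in software. It is not the paper's CLLN adaptation, whose details the abstract does not give. The chain task, the function names `q_update` and `train_chain`, and all hyperparameters (`alpha`, `gamma`, `epsilon`) are assumptions chosen for illustration only.

```python
import random


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.9):
    """One standard Q-learning step: move Q(s, a) toward the bootstrapped target."""
    # Terminal transitions have no future value to bootstrap from.
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def train_chain(n_states=4, episodes=500, epsilon=0.1, seed=0):
    """Hypothetical toy task: walk right along a chain to a rewarded end state."""
    rng = random.Random(seed)
    # Q-table with two actions per state: 0 = left, 1 = right.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # Exploit; ties are broken in favor of moving right.
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            q_update(q, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(train_chain()):
        print(f"state {s}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

In this sketch the Q-table plays the role of the value function and the epsilon-greedy choice plays the role of the policy function. No replay buffer is used: each transition is applied once and discarded.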