Equivariant Reinforcement Learning under Partial Observability
This work addresses the sample inefficiency of robot learning in partially observable environments. It is an incremental advance that applies equivariance to a known bottleneck.
The paper tackles sample inefficiency in partially observable robotic domains by incorporating equivariance as an inductive bias into actor-critic reinforcement learning agents. This yields significant improvements in sample efficiency and final performance over non-equivariant approaches, both in simulation and on real hardware.
Incorporating inductive biases is a promising approach for tackling challenging robot learning domains with sample-efficient solutions. This paper identifies partially observable domains where symmetries can be a useful inductive bias for efficient learning. Specifically, by encoding equivariance with respect to specific group symmetries into the neural networks, our actor-critic reinforcement learning agents can reuse solutions learned in the past for related scenarios. Consequently, our equivariant agents significantly outperform non-equivariant approaches in terms of sample efficiency and final performance, as demonstrated through experiments on a range of robotic tasks in simulation and on real hardware.
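To make the notion of equivariance concrete, the following is a minimal sketch and not the architecture used in the paper. It assumes the cyclic group C4 (rotations by multiples of 90 degrees), a square image observation, and a 2-D action vector, and it obtains equivariance by averaging an arbitrary policy over the group rather than by using equivariant network layers. It also acts on a single observation and ignores the history that a partially observable agent would condition on. All names (`make_base_policy`, `symmetrize`, `R`) are hypothetical.

```python
import numpy as np

# Assumed action representation: a 90-degree rotation matrix acting on (dx, dy).
R = np.array([[0.0, -1.0], [1.0, 0.0]])


def make_base_policy(size, seed=0):
    """Build an arbitrary (generally non-equivariant) policy: image -> 2-D action."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(2, size * size))

    def base_policy(obs):
        return np.tanh(weights @ obs.reshape(-1))

    return base_policy


def symmetrize(base_policy):
    """Average the base policy over C4 so that the result is rotation-equivariant."""

    def policy(obs):
        total = np.zeros(2)
        for k in range(4):
            rotated_obs = np.rot90(obs, k)  # group element g^k acting on the observation
            # Undo the rotation on the action side; R.T is the inverse of R.
            total += np.linalg.matrix_power(R.T, k) @ base_policy(rotated_obs)
        return total / 4.0

    return policy


policy = symmetrize(make_base_policy(size=5))
obs = np.random.default_rng(1).normal(size=(5, 5))

# Rotating the observation rotates the action in the same way.
assert np.allclose(policy(np.rot90(obs)), R @ policy(obs))
```

Because the symmetrized policy responds to a rotated observation with the correspondingly rotated action, experience gathered in one orientation of a task carries over to the other three, which is the kind of solution reuse the abstract describes.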