LG AI

Sep 20, 2022

Locally Constrained Representations in Reinforcement Learning

Somjit Nath, Rushiv Arora, Samira Ebrahimi Kahou

arXiv:2209.09441v2

1.8

h-index: 26

Originality: Incremental advance

AI Analysis

This work addresses representation overfitting in RL, a problem that matters for learning stability and efficiency in domains such as robotics and autonomous systems. The advance is incremental, since it builds on existing representation learning methods.

The paper tackles the problem of learning robust state representations in reinforcement learning by proposing locally constrained representations: an auxiliary loss makes each state's representation predictable from the representations of neighboring states. The authors report significant performance improvements in continuous control tasks.

The success of Reinforcement Learning (RL) heavily relies on the ability to learn robust representations from the observations of the environment. In most cases, the representations learned purely by the reinforcement learning loss can differ vastly across states depending on how the value functions change. However, the representations learned need not be very specific to the task at hand. Relying only on the RL objective may yield representations that vary greatly across successive time steps. In addition, since the RL loss has a changing target, the representations learned would depend on how good the current values/policies are. Thus, disentangling the representations from the main task would allow them to focus not only on the task-specific features but also the environment dynamics. To this end, we propose locally constrained representations, where an auxiliary loss forces the state representations to be predictable by the representations of the neighboring states. This encourages the representations to be driven not only by the value/policy learning but also by an additional loss that constrains the representations from over-fitting to the value loss. We evaluate the proposed method on several known benchmarks and observe strong performance. Especially in continuous control tasks, our experiments show a significant performance improvement.
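The auxiliary loss described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "neighboring states" means the k states before and after a target state in a trajectory, that the prediction is a fixed linear combination of the neighbors' representations, and that the error is a mean squared error. The function name `local_constraint_loss` and the `weights` parameterization are hypothetical.

```python
import numpy as np


def local_constraint_loss(reps, weights):
    """Mean squared error between each state's representation and a
    weighted combination of its temporal neighbors' representations.

    reps:    array-like of shape (T, d), representations of T consecutive states.
    weights: array-like of shape (2k,), mixing weights for the k states before
             and the k states after each target state (hypothetical
             parameterization, assumed here for illustration).
    """
    reps = np.asarray(reps, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 1 or weights.shape[0] == 0 or weights.shape[0] % 2 != 0:
        raise ValueError("weights must be a non-empty 1-D array of even length")
    k = weights.shape[0] // 2
    num_states = reps.shape[0]
    if num_states < 2 * k + 1:
        raise ValueError("trajectory is too short for the chosen neighborhood")

    errors = []
    # Only states with a full neighborhood on both sides are used as targets.
    for t in range(k, num_states - k):
        # Stack the k previous and k next representations: shape (2k, d).
        neighbors = np.concatenate([reps[t - k:t], reps[t + 1:t + k + 1]], axis=0)
        prediction = weights @ neighbors  # shape (d,)
        errors.append(np.mean((reps[t] - prediction) ** 2))
    return float(np.mean(errors))


# Representations that change linearly over time are perfectly predicted
# by averaging the two adjacent states, so the loss vanishes.
smooth = np.arange(5, dtype=float)[:, None] * np.array([1.0, -2.0])
print(local_constraint_loss(smooth, [0.5, 0.5]))
```

In training, a term like this would presumably be added to the RL objective with some weighting coefficient, so that the encoder is shaped by both losses; how the coefficient and the neighborhood are chosen is not stated in the abstract.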
