LG · May 26, 2023

Reinforcement Learning with Simple Sequence Priors

arXiv:2305.17109v1
15 citations
Originality: Highly original
AI Analysis

This work addresses efficiency and robustness in reinforcement learning for continuous control, though it builds incrementally on existing model-free approaches.

The paper tackles the problem of incorporating temporal simplicity into reinforcement learning. It proposes an algorithm that uses compressible action sequences as priors, which yields faster learning and higher returns than state-of-the-art model-free methods on continuous control tasks from the DeepMind Control Suite.

Everything else being equal, simpler models should be preferred over more complex ones. In reinforcement learning (RL), simplicity is typically quantified on an action-by-action basis -- but this timescale ignores temporal regularities, like repetitions, often present in sequential strategies. We therefore propose an RL algorithm that learns to solve tasks with sequences of actions that are compressible. We explore two possible sources of simple action sequences: sequences that can be learned by autoregressive models, and sequences that are compressible with off-the-shelf data compression algorithms. Distilling these preferences into sequence priors, we derive a novel information-theoretic objective that incentivizes agents to learn policies that maximize rewards while conforming to these priors. We show that the resulting RL algorithm leads to faster learning and attains higher returns than state-of-the-art model-free approaches in a series of continuous control tasks from the DeepMind Control Suite. These priors also produce a powerful information-regularized agent that is robust to noisy observations and can perform open-loop control.
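
To make the idea of "compressible action sequences" concrete, here is a minimal sketch of how an off-the-shelf compressor could score an action sequence. It is an illustration only, not the authors' implementation: the choice of zlib, the 16-bin discretization, the action range [-1, 1], the helper names (`discretize`, `compressed_length_bits`, `shaped_return`), and the trade-off weight `alpha` are all assumptions introduced here. The paper derives an information-theoretic objective from sequence priors; the penalty below is only a crude stand-in that captures the same intuition, namely that repetitive sequences are cheaper to encode than erratic ones.

```python
import zlib

import numpy as np


def discretize(actions, n_bins=16, low=-1.0, high=1.0):
    """Map continuous actions in [low, high] to integer bins in [0, n_bins)."""
    clipped = np.clip(np.asarray(actions, dtype=np.float64), low, high)
    bins = np.floor((clipped - low) / (high - low) * n_bins).astype(np.int64)
    # The upper edge (action == high) would land in bin n_bins; fold it back.
    return np.minimum(bins, n_bins - 1).astype(np.uint8)


def compressed_length_bits(actions, n_bins=16):
    """Length in bits of the zlib-compressed, discretized action sequence."""
    payload = discretize(actions, n_bins).tobytes()
    return 8 * len(zlib.compress(payload, 9))


def shaped_return(rewards, actions, alpha=0.01):
    """Total reward minus alpha times the code length (hypothetical trade-off)."""
    return float(np.sum(rewards)) - alpha * compressed_length_bits(actions)


rng = np.random.default_rng(0)
repetitive = np.tile([-1.0, 0.0, 1.0, 0.0], 64)    # 256 steps, period 4
random_seq = rng.uniform(-1.0, 1.0, size=256)      # 256 steps, no structure

print("repetitive:", compressed_length_bits(repetitive), "bits")
print("random:    ", compressed_length_bits(random_seq), "bits")
```

With equal rewards, `shaped_return` favors the repetitive sequence because its compressed code is shorter. A real agent would need a per-step or differentiable surrogate for this quantity; the sketch only shows how compressibility can be measured after the fact.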

