LG · AI · Oct 12, 2021

StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning

arXiv:2110.06206v3 · 44 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses visual reinforcement learning for agents in environments like Atari and robotics, presenting an incremental improvement by introducing a Markovian-like inductive bias to existing Transformer methods.

The paper proposes StARformer, a Transformer model for visual reinforcement learning that explicitly models short-term state-action-reward representations to improve long-term sequence modeling. It outperforms the state-of-the-art Transformer-based method on the Atari and DeepMind Control Suite benchmarks in both offline-RL and imitation learning settings.

Reinforcement Learning (RL) can be considered as a sequence modeling task: given a sequence of past state-action-reward experiences, an agent predicts a sequence of next actions. In this work, we propose State-Action-Reward Transformer (StARformer) for visual RL, which explicitly models short-term state-action-reward representations (StAR-representations), essentially introducing a Markovian-like inductive bias to improve long-term modeling. Our approach first extracts StAR-representations by self-attending image state patches, action, and reward tokens within a short temporal window. These are then combined with pure image state representations, extracted as convolutional features, to perform self-attention over the whole sequence. Our experiments show that StARformer outperforms the state-of-the-art Transformer-based method on image-based Atari and DeepMind Control Suite benchmarks, in both offline-RL and imitation learning settings. StARformer is also more compliant with longer sequences of inputs. Our code is available at https://github.com/elicassion/StARformer.
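The two-stage attention described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a temporal window of one timestep, identity query/key/value projections, mean pooling to form each StAR-representation, and random vectors standing in for the patch, action, reward, and convolutional features. The function names `star_representation` and `sequence_step` are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, causal=False):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (queries = keys = values = tokens),
    so no weights are learned. With causal=True, token i only attends
    to tokens 0..i.
    """
    d = len(tokens[0])
    out = []
    for i, q in enumerate(tokens):
        keys = tokens[: i + 1] if causal else tokens
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * k[j] for w, k in zip(weights, keys)) for j in range(d)])
    return out


def star_representation(patch_tokens, action_token, reward_token):
    """Local stage: attend over one timestep's image patches, action, and reward.

    Assumption: the attended tokens are mean-pooled into a single vector.
    """
    group = patch_tokens + [action_token, reward_token]
    attended = self_attention(group)
    d = len(attended[0])
    return [sum(t[j] for t in attended) / len(attended) for j in range(d)]


def sequence_step(star_reps, state_feats):
    """Global stage: interleave StAR-representations with per-frame state
    features (convolutional features in the abstract) and attend causally
    over the whole sequence."""
    seq = []
    for g, h in zip(star_reps, state_feats):
        seq.extend([g, h])
    return self_attention(seq, causal=True)


# Toy usage: T timesteps, P patches per frame, D-dimensional tokens.
random.seed(0)
T, P, D = 3, 4, 8


def rand_vec():
    return [random.uniform(-1, 1) for _ in range(D)]


stars = [
    star_representation([rand_vec() for _ in range(P)], rand_vec(), rand_vec())
    for _ in range(T)
]
conv_feats = [rand_vec() for _ in range(T)]  # placeholder for CNN features
out = sequence_step(stars, conv_feats)
print(len(out), len(out[0]))  # 2*T tokens, each of dimension D
```

The local stage only mixes information inside one timestep, which is one way to read the "Markovian-like inductive bias" the abstract mentions. The global stage is then free to model long-range dependencies across timesteps.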

Code Implementations: 1 repo