RO AI LG | Sep 18, 2023

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Yevgen Chebotar, Quan Vuong, Alex Irpan, Karol Hausman, Fei Xia, Yao Lu, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz

arXiv:2309.10150v2 | 34.1157 citations | h-index: 166

Originality: Incremental advance

AI Analysis

This work addresses the challenge of scalable offline reinforcement learning for robotics, enabling better policy training from mixed human and autonomous data. It is an incremental advance, as it builds on existing Q-learning and Transformer techniques.

The paper tackles the training of multi-task robotic manipulation policies from large offline datasets. It introduces Q-Transformer, a method that uses a Transformer to represent Q-functions over discretized actions, and reports that it outperforms prior offline RL and imitation learning techniques on a diverse real-world task suite.

In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data. Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups. We therefore refer to the method as Q-Transformer. By discretizing each action dimension and representing the Q-value of each action dimension as separate tokens, we can apply effective high-capacity sequence modeling techniques for Q-learning. We present several design decisions that enable good performance with offline RL training, and show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large diverse real-world robotic manipulation task suite. The project's website and videos can be found at https://qtransformer.github.io
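The per-dimension discretization described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bin count, the action range, and `toy_q_fn` (a stand-in for the Transformer Q-function) are all hypothetical choices made for illustration. The sketch only shows how each action dimension becomes one discrete token, and how a greedy action can be picked one dimension at a time, conditioning on the bins already chosen.

```python
from typing import Callable, List, Sequence

NUM_BINS = 8                          # hypothetical: bins per action dimension
ACTION_LOW, ACTION_HIGH = -1.0, 1.0   # hypothetical: range of every dimension


def discretize(action: Sequence[float]) -> List[int]:
    """Map each continuous action dimension to a bin index in [0, NUM_BINS)."""
    bins = []
    for a in action:
        clipped = min(max(a, ACTION_LOW), ACTION_HIGH)
        idx = int((clipped - ACTION_LOW) / (ACTION_HIGH - ACTION_LOW) * NUM_BINS)
        bins.append(min(idx, NUM_BINS - 1))  # the upper edge falls in the last bin
    return bins


def undiscretize(bins: Sequence[int]) -> List[float]:
    """Map bin indices back to the center value of each bin."""
    width = (ACTION_HIGH - ACTION_LOW) / NUM_BINS
    return [ACTION_LOW + (b + 0.5) * width for b in bins]


# A Q-function here takes (state, bins chosen so far) and returns one
# Q-value per bin of the next action dimension.
QFn = Callable[[Sequence[float], Sequence[int]], List[float]]


def greedy_action(q_fn: QFn, state: Sequence[float], action_dim: int) -> List[int]:
    """Pick the highest-valued bin for each dimension in turn, autoregressively."""
    prefix: List[int] = []
    for _ in range(action_dim):
        q_values = q_fn(state, prefix)
        best = max(range(len(q_values)), key=q_values.__getitem__)
        prefix.append(best)
    return prefix


def toy_q_fn(state: Sequence[float], prefix: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for the learned model: scores each bin by how
    close its center is to the state entry for the current dimension."""
    target = state[len(prefix)]
    centers = undiscretize(list(range(NUM_BINS)))
    return [-(c - target) ** 2 for c in centers]


if __name__ == "__main__":
    print(discretize([-1.0, 0.0, 1.0]))
    bins = greedy_action(toy_q_fn, state=[0.3, -0.9], action_dim=2)
    print(bins, undiscretize(bins))
```

Selecting one dimension at a time means each step compares only `NUM_BINS` values, rather than every combination of bins across all dimensions. Training via offline temporal difference backups, which the abstract also mentions, is outside the scope of this sketch.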
