LG AI | Mar 3, 2023

RePreM: Representation Pre-training with Masked Model for Reinforcement Learning

Yuanying Cai, Chuheng Zhang, Wei Shen, Xuyun Zhang, Wenjie Ruan, Longbo Huang

Tsinghua

arXiv:2303.01668v1 | 8.87 citations | h-index: 13

Originality: Incremental advance

AI Analysis

This work targets RL practitioners who need better state representations. It offers a simple, scalable pre-training method that improves sample efficiency and transfer, though it appears incremental because it builds on existing masked modeling techniques.

The paper tackles representation pre-training in reinforcement learning by proposing RePreM, a masked model that predicts masked states or actions in trajectories. The method is shown to be effective in dynamic prediction, transfer learning, and sample-efficient RL, and to scale well with dataset size and encoder scale.

Inspired by the recent success of sequence modeling in RL and the use of masked language models for pre-training, we propose a masked model for pre-training in RL, RePreM (Representation Pre-training with Masked Model), which trains the encoder combined with transformer blocks to predict the masked states or actions in a trajectory. RePreM is simple but effective compared to existing representation pre-training methods in RL. By relying on sequence modeling, it avoids algorithmic sophistication (such as data augmentation or estimating multiple models) and generates a representation that captures long-term dynamics well. Empirically, we demonstrate the effectiveness of RePreM in various tasks, including dynamic prediction, transfer learning, and sample-efficient RL with both value-based and actor-critic methods. Moreover, we show that RePreM scales well with dataset size, dataset quality, and the scale of the encoder, which indicates its potential towards big RL models.
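The masking step described in the abstract can be illustrated with a minimal sketch of the data preparation only, not the authors' implementation. It flattens a trajectory into an interleaved state/action sequence and hides a random subset of entries, returning the hidden originals as prediction targets. The names `interleave`, `mask_trajectory`, and `MASK`, the 0.25 mask ratio, and the choice to keep each token's type visible are all assumptions made for illustration; the encoder and transformer blocks that would consume the masked sequence are omitted.

```python
import random

MASK = "<mask>"  # hypothetical placeholder value, not taken from the paper


def interleave(states, actions):
    """Flatten a trajectory into [s_0, a_0, s_1, a_1, ...]."""
    if len(states) != len(actions):
        raise ValueError("expected one action per state")
    tokens = []
    for s, a in zip(states, actions):
        tokens.append(("state", s))
        tokens.append(("action", a))
    return tokens


def mask_trajectory(tokens, mask_ratio=0.25, rng=None):
    """Hide a random subset of tokens and return (masked_tokens, targets).

    targets maps each masked position to the original token, i.e. what a
    masked model would be trained to reconstruct. The mask ratio is an
    assumed value chosen for this sketch.
    """
    if rng is None:
        rng = random.Random()
    n_mask = max(1, round(mask_ratio * len(tokens)))
    positions = rng.sample(range(len(tokens)), n_mask)
    masked = list(tokens)
    targets = {}
    for p in positions:
        kind, value = tokens[p]
        targets[p] = (kind, value)
        masked[p] = (kind, MASK)  # assumption: the token type stays visible
    return masked, targets


# Toy trajectory: four 2-D states with one discrete action each.
states = [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
actions = [1, 0, 1, 1]
tokens = interleave(states, actions)
masked, targets = mask_trajectory(tokens, mask_ratio=0.25, rng=random.Random(0))
```

In a full pipeline, a loss would be computed only at the positions listed in `targets`, so the model has to infer hidden states or actions from the surrounding trajectory context.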
