LG Sep 11, 2024

Unveiling Markov Heads in Pretrained Language Models for Offline Reinforcement Learning

Wenhao Zhao, Qiushui Xu, Linjie Xu, Lei Song, Jinyu Wang, Chunlai Zhou, Jiang Bian

arXiv:2409.06985v27.92 citations h-index: 27

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in using pretrained language models for offline RL by improving adaptability to long-term tasks. The advance is incremental because it builds on existing decision transformer methods.

The paper identifies a 'Markov head' in pretrained language models: an attention head that places extreme attention on the last token, which limits good performance to short-term environments in offline reinforcement learning. It proposes GPT2-DTMA, which adds Mixture of Attention to improve long-term performance, and reports comparable results in short-term environments and a significantly narrower gap in long-term environments.

Recently, incorporating knowledge from pretrained language models (PLMs) into decision transformers (DTs) has attracted significant attention in offline reinforcement learning (RL). These PLMs perform well in RL tasks, raising an intriguing question: what kind of knowledge from PLMs has been transferred to RL to achieve such good results? This work takes a first step toward answering it by analyzing each attention head quantitatively, and it identifies the Markov head, a crucial component among the attention heads of PLMs. The Markov head places extreme attention on the last-input token and performs well only in short-term environments. Furthermore, we prove that this extreme attention cannot be changed by re-training the embedding layer or by fine-tuning. Inspired by our analysis, we propose a general method, GPT2-DTMA, which equips a pretrained DT with Mixture of Attention (MoA) to accommodate diverse attention requirements during fine-tuning. Extensive experiments corroborate our theorems and demonstrate the effectiveness of GPT2-DTMA: it achieves comparable performance in short-term environments while significantly narrowing the performance gap in long-term environments.
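
As a rough illustration of what "extreme attention on the last-input token" can look like, the sketch below applies a causal softmax to toy score matrices and measures how much weight each query places on its own, most recent position. It is not the paper's metric or analysis: the function names (`causal_softmax`, `last_token_mass`) and the two toy heads are hypothetical, chosen only to make the contrast visible.

```python
import math

def causal_softmax(scores):
    """Row-wise softmax over a square score matrix under a causal mask."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]                 # query i may only see keys 0..i
        peak = max(visible)                    # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # masked (future) positions receive zero weight
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

def last_token_mass(weights):
    """Mean attention weight each query puts on its own (latest) position."""
    # Row 0 is skipped: its only visible key is itself, so its weight is trivially 1.
    rows = range(1, len(weights))
    return sum(weights[i][i] for i in rows) / len(rows)

T = 4
# Hypothetical head whose scores strongly favor the latest token (assumed values).
diagonal_scores = [[10.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
# Hypothetical head with no preference among visible tokens (assumed values).
uniform_scores = [[0.0] * T for _ in range(T)]

diagonal_mass = last_token_mass(causal_softmax(diagonal_scores))
uniform_mass = last_token_mass(causal_softmax(uniform_scores))
print(f"diagonal head: {diagonal_mass:.4f}, uniform head: {uniform_mass:.4f}")
```

A head like the first toy example effectively conditions only on the current input, which is consistent with the abstract's observation that such heads help in short-term environments but not in long-term ones.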
