LG AI NE · Mar 25, 2022

Unsupervised Learning of Temporal Abstractions with Slot-based Transformers

Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber, Sjoerd van Steenkiste

arXiv:2203.13573v2 · 12.420 citations · h-index: 100 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a limitation of earlier unsupervised methods for learning sub-routines in complex reinforcement learning: they process each trajectory strictly sequentially. It represents an incremental improvement.

The paper tackles the problem of learning temporal abstractions in reinforcement learning by proposing SloTTAr, a parallel approach that integrates Transformers with Slot Attention. SloTTAr outperforms baselines in boundary point discovery and is up to 7x faster to train.

The discovery of reusable sub-routines simplifies decision-making and planning in complex reinforcement learning problems. Previous approaches propose to learn such temporal abstractions in a purely unsupervised fashion through observing state-action trajectories gathered from executing a policy. However, a current limitation is that they process each trajectory in an entirely sequential manner, which prevents them from revising earlier decisions about sub-routine boundary points in light of new incoming information. In this work we propose SloTTAr, a fully parallel approach that integrates sequence processing Transformers with a Slot Attention module and adaptive computation for learning about the number of such sub-routines in an unsupervised fashion. We demonstrate how SloTTAr is capable of outperforming strong baselines in terms of boundary point discovery, even for sequences containing variable amounts of sub-routines, while being up to 7x faster to train on existing benchmarks.
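To make the slot-based idea concrete, here is a minimal sketch of how slots can compete for the time steps of a trajectory, so that every step is assigned with the whole sequence in view. This is not the SloTTAr implementation: it omits the Transformers, the adaptive computation, and all learned parameters (standard Slot Attention also uses learned projections and a recurrent slot update). The function names `slot_attention` and `boundaries`, the toy features, and the argmax readout of boundary points are assumptions made purely for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def slot_attention(inputs, slots, iters=3, eps=1e-8):
    """Simplified slot attention (hypothetical helper, no learned weights).

    inputs: list of T feature vectors, one per time step.
    slots:  list of K initial slot vectors of the same dimension.
    Returns the updated slots and the T x K attention of the final iteration.
    """
    dim = len(inputs[0])
    scale = 1.0 / math.sqrt(dim)
    attn = []
    for _ in range(iters):
        # Softmax over slots: the slots compete for each time step.
        attn = [softmax([scale * dot(x, s) for s in slots]) for x in inputs]
        # Each slot becomes the attention-weighted mean of all time steps.
        new_slots = []
        for k in range(len(slots)):
            weights = [row[k] for row in attn]
            total = sum(weights) + eps
            new_slots.append([
                sum(w * x[d] for w, x in zip(weights, inputs)) / total
                for d in range(dim)
            ])
        slots = new_slots
    return slots, attn

def boundaries(attn):
    """Time steps where the most-attending slot changes (assumed readout)."""
    owners = [max(range(len(row)), key=row.__getitem__) for row in attn]
    return [t for t in range(1, len(owners)) if owners[t] != owners[t - 1]]

# Toy trajectory: three steps of one "sub-routine", then three of another.
trajectory = [[4.0, 0.0]] * 3 + [[0.0, 4.0]] * 3
_, attn = slot_attention(trajectory, slots=[[1.0, 0.0], [0.0, 1.0]])
print(boundaries(attn))  # [3]
```

Because the attention is computed over all time steps at once, the assignment of an early step can change between iterations as the slots are refined, which a strictly sequential segmenter cannot do.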
