LG | AI | Dec 6, 2023

Generalization to New Sequential Decision Making Tasks with In-Context Learning

arXiv:2312.03801v1 | 41 citations | h-index: 28 | ICML
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of training autonomous agents to adapt quickly to new tasks from minimal data. The advance is incremental: it builds on existing in-context learning methods but applies them to a more error-sensitive domain.

The paper tackles in-context learning for sequential decision-making tasks. It shows that naively applying transformers fails, and that a model trained on large, diverse offline datasets can learn new MiniHack and Procgen tasks from a few demonstrations, without any weight updates.

Training autonomous agents that can learn new tasks from only a handful of demonstrations is a long-standing problem in machine learning. Recently, transformers have been shown to learn new language or vision tasks without any weight updates from only a few examples, also referred to as in-context learning. However, the sequential decision making setting poses additional challenges: it has a lower tolerance for errors, since the environment's stochasticity or the agent's actions can lead to unseen, and sometimes unrecoverable, states. In this paper, we use an illustrative example to show that naively applying transformers to sequential decision making problems does not enable in-context learning of new tasks. We then demonstrate how training on sequences of trajectories with certain distributional properties leads to in-context learning of new sequential decision making tasks. We investigate different design choices and find that larger model and dataset sizes, as well as more task diversity, environment stochasticity, and trajectory burstiness, all result in better in-context learning of new out-of-distribution tasks. By training on large diverse offline datasets, our model is able to learn new MiniHack and Procgen tasks without any weight updates from just a handful of demonstrations.
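The abstract does not spell out how "sequences of trajectories" with "trajectory burstiness" are assembled, so the following is only a minimal sketch under a stated assumption: that burstiness is the probability that a training sequence contains at least two trajectories from the same task. The function `build_sequence`, its parameters, and the toy dataset are hypothetical illustrations and are not taken from the paper's code.

```python
import random


def build_sequence(trajectories_by_task, num_trajectories, p_bursty, rng):
    """Assemble one multi-trajectory training sequence (hypothetical helper).

    Returns a list of (task_id, trajectory) pairs. The last pair is the
    query trajectory; the pairs before it are context trajectories.
    Assumption: with probability p_bursty, at least one context trajectory
    is drawn from the same task as the query.
    """
    if num_trajectories < 2:
        raise ValueError("need at least one context trajectory and one query")

    tasks = list(trajectories_by_task)
    query_task = rng.choice(tasks)
    query = (query_task, rng.choice(trajectories_by_task[query_task]))

    context = []
    if rng.random() < p_bursty:
        # Bursty sequence: guarantee a same-task trajectory in the context.
        context.append((query_task, rng.choice(trajectories_by_task[query_task])))

    # Fill the remaining context slots with trajectories from random tasks.
    while len(context) < num_trajectories - 1:
        task = rng.choice(tasks)
        context.append((task, rng.choice(trajectories_by_task[task])))

    rng.shuffle(context)  # do not leak the same-task trajectory's position
    return context + [query]


# Toy dataset: task id -> list of placeholder trajectories.
toy_data = {
    "task_a": ["a0", "a1"],
    "task_b": ["b0"],
    "task_c": ["c0", "c1", "c2"],
}
sequence = build_sequence(toy_data, num_trajectories=4, p_bursty=1.0,
                          rng=random.Random(0))
print(sequence)
```

With `p_bursty=1.0` every sequence contains a context trajectory from the query's task; with `p_bursty=0.0` a match can still occur by chance, so in this sketch the parameter sets a lower bound on how often sequences are bursty rather than an exact rate.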

Code Implementations: 1 repo
