CL · LG · Apr 16, 2021

Language Models are Few-Shot Butlers

arXiv:2104.07972v2 · 668 citations
AI Analysis

This work addresses the challenge of reducing the amount of demonstration data that language model agents need in interactive environments, offering a significant, if incremental, improvement.

The paper tackles the cost of collecting expert demonstrations in text-based environments with a two-stage procedure: a language model is first fine-tuned on a small set of demonstrations (1.2% of the expert data), then improved further with a simple reinforcement learning algorithm through interaction with the environment. The resulting agent achieves a 51% absolute improvement in success rate over existing methods in the ALFWorld environment.

Pretrained language models demonstrate strong performance in most NLP tasks when fine-tuned on small task-specific datasets. Hence, these autoregressive models constitute ideal agents to operate in text-based environments where language understanding and generative capabilities are essential. Nonetheless, collecting expert demonstrations in such environments is a time-consuming endeavour. We introduce a two-stage procedure to learn from a small set of demonstrations and further improve by interacting with an environment. We show that language models fine-tuned with only 1.2% of the expert demonstrations and a simple reinforcement learning algorithm achieve a 51% absolute improvement in success rate over existing methods in the ALFWorld environment.
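
The two-stage idea in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the three-room task, the `CountPolicy` class (a tabular stand-in for a fine-tuned language model), and the stage-two rule of keeping only successful interactions as extra training data, which is one simple reinforcement-style scheme among several possible ones.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy task: each observation has exactly one correct text command.
TASKS = {
    "kitchen": "open fridge",
    "bedroom": "open drawer",
    "bathroom": "open cabinet",
}
ACTIONS = sorted(set(TASKS.values()))


class CountPolicy:
    """Tabular stand-in for a language model: counts (observation, action) pairs."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, pairs):
        # "Fine-tuning" here is just adding the pairs to the count table.
        for obs, action in pairs:
            self.counts[obs][action] += 1

    def act(self, obs, rng=None, explore=0.0):
        seen = self.counts.get(obs)
        # With an rng, explore when the observation is unseen or by chance.
        if rng is not None and (not seen or rng.random() < explore):
            return rng.choice(ACTIONS)
        if not seen:
            return "look"  # greedy fallback for unseen observations (always fails)
        return seen.most_common(1)[0][0]


def success_rate(policy):
    # Greedy evaluation over all tasks, no exploration.
    return sum(policy.act(obs) == goal for obs, goal in TASKS.items()) / len(TASKS)


def improve(policy, rng, episodes=300, explore=0.2):
    # Stage 2 (assumed scheme): interact, keep only rewarded actions, refit on them.
    rooms = sorted(TASKS)
    for _ in range(episodes):
        obs = rng.choice(rooms)
        action = policy.act(obs, rng, explore)
        reward = 1 if action == TASKS[obs] else 0
        if reward:
            policy.fit([(obs, action)])


rng = random.Random(0)
policy = CountPolicy()

# Stage 1: imitate a small set of expert demonstrations (one of three tasks).
policy.fit([("kitchen", "open fridge")])
before = success_rate(policy)

# Stage 2: improve by interacting with the environment.
improve(policy, rng)
after = success_rate(policy)

print(f"success after demonstrations only: {before:.2f}")
print(f"success after interaction:         {after:.2f}")
```

In this sketch the demonstrations cover only one task, so the remaining behaviour has to be discovered through interaction. The difference `after - before` is an absolute improvement in success rate, the same kind of quantity as the 51% figure reported in the abstract, although the toy numbers say nothing about the paper's actual results.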

Code Implementations: 1 repo