AI · Apr 14, 2022

Retrospective on the 2021 BASALT Competition on Learning from Human Feedback

Berkeley
arXiv:2204.07123v1 · 10 citations · h-index: 18
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of promoting research on learning from human feedback (LfHF) for open-world AI tasks, though its contribution is incremental because it builds on existing competition frameworks.

The authors organized the first MineRL BASALT competition at NeurIPS 2021 to advance agents that use learning from human feedback (LfHF) to solve open-world tasks in Minecraft. Participants produced a diverse range of LfHF algorithms, but the competition drew fewer participants and submissions than its sister competition, MineRL Diamond.

We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft, and allowed participants to use any approach they wanted to build agents that could accomplish the tasks. Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types. The three winning teams implemented significantly different approaches while achieving similar performance. Interestingly, their approaches performed well on different tasks, validating our choice of tasks to include in the competition. While the outcomes validated the design of our competition, we did not get as many participants and submissions as our sister competition, MineRL Diamond. We speculate about the causes of this problem and suggest improvements for future iterations of the competition.

