LG, AI, MA, ML | Sep 17, 2019

Emergent Tool Use From Multi-Agent Autocurricula

arXiv:1909.07528v2 | 756 citations
AI Analysis

This work addresses the challenge of enabling AI agents to acquire human-relevant skills through self-supervised learning. It offers a scalable approach for multi-agent environments, though it builds incrementally on existing reinforcement learning methods.

The study uses multi-agent competition in hide-and-seek to develop sophisticated tool use and coordination in AI agents. Agents learn emergent strategies such as building shelters from moveable boxes and using ramps to overcome obstacles. The authors report evidence of six distinct phases of strategy and compare the hide-and-seek agents against intrinsic motivation and random initialization baselines in a suite of domain-specific tests.

Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of which creates a new pressure for the opposing team to adapt; for instance, agents learn to build multi-object shelters using moveable boxes which in turn leads to agents discovering that they can overcome obstacles using ramps. We further provide evidence that multi-agent competition may scale better with increasing environment complexity and leads to behavior that centers around far more human-relevant skills than other self-supervised reinforcement learning methods such as intrinsic motivation. Finally, we propose transfer and fine-tuning as a way to quantitatively evaluate targeted capabilities, and we compare hide-and-seek agents to both intrinsic motivation and random initialization baselines in a suite of domain-specific intelligence tests.

Code Implementations (3 repos)
