LG RO Apr 7, 2021

On the Critical Role of Conventions in Adaptive Human-AI Collaboration

Andy Shih, Arjun Sawhney, Jovana Kondic, Stefano Ermon, Dorsa Sadigh

arXiv:2104.02871v1 18.646 citations Has Code

Originality: Incremental advance

AI Analysis

This work uses conventions to improve human-AI collaboration, which could make agents more adaptable in domains such as gaming or robotics. It appears incremental because it builds on existing representation learning methods.

The paper aims to let AI agents adapt quickly to new human partners and new tasks by separating task rules from partner-specific conventions. Across three collaborative tasks, the resulting agents coordinate zero-shot with old partners on new tasks and adapt quickly to new partners.

Humans can quickly adapt to new partners in collaborative tasks (e.g., playing basketball), because they understand which fundamental skills of the task (e.g., how to dribble, how to shoot) carry over across new partners. Humans can also quickly adapt to similar tasks with the same partners by carrying over conventions that they have developed (e.g., a raised hand signals "pass the ball"), without learning to coordinate from scratch. To collaborate seamlessly with humans, AI agents should adapt quickly to new partners and new tasks as well. However, current approaches have not attempted to distinguish between the complexities intrinsic to a task and the conventions used by a partner, and more generally there has been little focus on leveraging conventions for adapting to new settings. In this work, we propose a learning framework that teases apart rule-dependent representation from convention-dependent representation in a principled way. We show that, under some assumptions, our rule-dependent representation is a sufficient statistic of the distribution over best-response strategies across partners. Using this separation of representations, our agents are able to adapt quickly to new partners, and to coordinate with old partners on new tasks in a zero-shot manner. We experimentally validate our approach on three collaborative tasks varying in complexity: a contextual multi-armed bandit, a block placing task, and the card game Hanabi.
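The intuition behind separating rule-dependent from convention-dependent knowledge can be illustrated with a toy contextual bandit. The sketch below is not the authors' framework, which learns these representations. It hand-codes them as lookup tables, and every name in it (`RULES`, `Partner`, `Agent`, `adapt`) is a hypothetical choice made for illustration. It assumes that the task rules restrict which arms can ever be rewarded in a context, while each partner's convention fixes which of those arms the partner actually plays.

```python
from typing import Dict, List, Optional

# Assumed toy task: each context allows only some arms under the task rules.
RULES: Dict[str, List[int]] = {"red": [0, 1], "blue": [2, 3]}
N_ARMS = 4


class Partner:
    """A partner whose convention maps each context to one arm."""

    def __init__(self, convention: Dict[str, int]) -> None:
        self.convention = convention

    def act(self, context: str) -> int:
        return self.convention[context]


def reward(context: str, agent_arm: int, partner_arm: int) -> int:
    """Reward 1 only if both players pick the same rule-allowed arm."""
    return int(agent_arm == partner_arm and agent_arm in RULES[context])


class Agent:
    """Keeps rule knowledge (shared) apart from convention knowledge (per partner)."""

    def __init__(self, rule_knowledge: Optional[Dict[str, List[int]]] = None) -> None:
        self.rule_knowledge = rule_knowledge or {}  # carries over across partners
        self.convention: Dict[str, int] = {}        # relearned for each partner

    def candidates(self, context: str) -> List[int]:
        # Without rule knowledge, every arm has to be considered.
        return self.rule_knowledge.get(context, list(range(N_ARMS)))

    def adapt(self, partner: Partner, contexts: List[str]) -> int:
        """Try candidate arms until coordination succeeds; return the trial count."""
        self.convention = {}
        trials = 0
        for context in contexts:
            for arm in self.candidates(context):
                trials += 1
                if reward(context, arm, partner.act(context)) == 1:
                    self.convention[context] = arm
                    break
        return trials


new_partner = Partner({"red": 1, "blue": 3})
print(Agent().adapt(new_partner, ["red", "blue"]))                      # 6
print(Agent(rule_knowledge=RULES).adapt(new_partner, ["red", "blue"]))  # 4
```

In this toy setting, the agent that carries over rule knowledge only has to search among rule-allowed arms when it meets a new partner, so it coordinates in fewer trials than an agent starting from scratch.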
