AI, LG | Nov 18, 2021

Reinforcement Learning on Human Decision Models for Uniquely Collaborative AI Teammates

arXiv:2111.09800v1
Originality: Incremental advance
AI Analysis

This work addresses the challenge of developing uniquely collaborative AI teammates for human-machine teams. The advance is incremental, since it builds on existing methods for modeling human decision-making.

The paper tackles the problem of creating an AI agent that plays the collaborative card game Hanabi well with human teammates. The resulting agent achieved a human-play average score of 16.5, which the paper reports as exceeding the state of the art for human-bot Hanabi scores.

In 2021, the Johns Hopkins University Applied Physics Laboratory held an internal challenge to develop artificially intelligent (AI) agents that could excel at the collaborative card game Hanabi. Agents were evaluated on their ability to play with human players whom the agents had never previously encountered. This study details the development of the agent that won the challenge by achieving a human-play average score of 16.5, outperforming the current state-of-the-art for human-bot Hanabi scores. The winning agent's development consisted of observing and accurately modeling the author's decision making in Hanabi, then training with a behavioral clone of the author. Notably, the agent discovered a human-complementary play style by first mimicking human decision making, then exploring variations to the human-like strategy that led to higher simulated human-bot scores. This work examines in detail the design and implementation of this human-compatible Hanabi teammate, as well as the existence and implications of human-complementary strategies and how they may be explored for more successful applications of AI in human-machine teams.

Code Implementations: 1 repo
