HC AI | Jan 31, 2025

In Pursuit of Predictive Models of Human Preferences Toward AI Teammates

Ho Chit Siu, Jaime D. Peña, Yutai Zhou, Ross E. Allen

arXiv:2503.15516v17.21 citations | h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses the challenge of designing AI agents that human collaborators prefer, offering incremental insights into which behavioral metrics predict that preference.

The study tackled the problem of predicting human preferences for AI teammates by correlating objective AI metrics with subjective human evaluations in the Hanabi game, finding that action diversity and strategic dominance were more predictive than final game scores.

We seek measurable properties of AI agents that make them better or worse teammates from the subjective perspective of human collaborators. Our experiments use the cooperative card game Hanabi -- a common benchmark for AI-teaming research. We first evaluate AI agents on a set of objective metrics based on task performance, information theory, and game theory, which are measurable without human interaction. Next, we evaluate subjective human preferences toward AI teammates in a large-scale (N=241) human-AI teaming experiment. Finally, we correlate the AI-only objective metrics with the human subjective preferences. Our results refute common assumptions from prior literature on reinforcement learning, revealing new correlations between AI behaviors and human preferences. We find that the final game score a human-AI team achieves is less predictive of human preferences than esoteric measures of AI action diversity, strategic dominance, and ability to team with other AI. In the future, these correlations may help shape reward functions for training human-collaborative AI.
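The final step described above, correlating AI-only objective metrics with human subjective preferences, might be sketched roughly as follows. This is an illustrative sketch only, not the authors' analysis: the choice of Spearman rank correlation, the agent names, the metric names (`self_play_score`, `action_entropy`, `cross_play_score`), and every number are hypothetical placeholders rather than data or methods taken from the paper.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder values, invented purely for illustration.
# Each row: objective metrics measured without humans, plus a mean
# human preference rating for that agent.
agents = {
    "agent_a": {"self_play_score": 22.0, "action_entropy": 0.9, "cross_play_score": 8.0, "preference": 2.4},
    "agent_b": {"self_play_score": 18.5, "action_entropy": 1.6, "cross_play_score": 14.0, "preference": 3.9},
    "agent_c": {"self_play_score": 15.0, "action_entropy": 1.3, "cross_play_score": 11.5, "preference": 3.2},
    "agent_d": {"self_play_score": 24.0, "action_entropy": 0.7, "cross_play_score": 5.0, "preference": 2.0},
}

preferences = [row["preference"] for row in agents.values()]
for metric in ("self_play_score", "action_entropy", "cross_play_score"):
    metric_values = [row[metric] for row in agents.values()]
    rho = spearman(metric_values, preferences)
    print(f"{metric}: Spearman rho = {rho:+.2f}")
```

With only a handful of agents, correlations like these are extremely noisy, so a real analysis would also need significance testing and the actual experimental data.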
