GT · AI · MA · Feb 24, 2024

Cooperation and Control in Delegation Games

arXiv:2402.15821v2 · 3 citations · h-index: 13 · IJCAI
Originality: Synthesis-oriented
AI Analysis

This paper addresses issues in human-machine interaction, with applications such as personal assistants and autonomous vehicles, but its contribution is incremental: it formalizes existing concepts.

The paper tackles the problem of control and cooperation failures in delegation games involving humans and machines, showing theoretically and empirically how alignment and capabilities affect principals' welfare and can be estimated to design better AI systems.

Many settings of interest involving humans and machines -- from virtual personal assistants to autonomous vehicles -- can naturally be modelled as principals (humans) delegating to agents (machines), which then interact with each other on their principals' behalf. We refer to these multi-principal, multi-agent scenarios as delegation games. In such games, there are two important failure modes: problems of control (where an agent fails to act in line with their principal's preferences) and problems of cooperation (where the agents fail to work well together). In this paper we formalise and analyse these problems, further breaking them down into issues of alignment (do the players have similar preferences?) and capabilities (how competent are the players at satisfying those preferences?). We show -- theoretically and empirically -- how these measures determine the principals' welfare, how they can be estimated using limited observations, and thus how they might be used to help us design more aligned and cooperative AI systems.
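To make the two failure modes concrete, here is a minimal toy sketch in Python. It is not the paper's formalism: the payoff tables, the function names `pure_nash` and `principal_welfare`, and the use of pure Nash equilibria as the agents' solution concept are all illustrative assumptions. Two agents each pick an action on behalf of their principal, and welfare is measured as the sum of the principals' (not the agents') payoffs at the outcome the agents reach.

```python
from itertools import product

ACTIONS = ("C", "D")  # hypothetical action labels: cooperate / defect


def pure_nash(payoffs):
    """Return joint actions where neither player gains by deviating alone.

    `payoffs` maps a joint action (a1, a2) to a pair (u1, u2).
    """
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        best1 = all(u1 >= payoffs[(d, a2)][0] for d in ACTIONS)
        best2 = all(u2 >= payoffs[(a1, d)][1] for d in ACTIONS)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


def principal_welfare(principal_payoffs, joint_action):
    """Sum of the principals' payoffs at a joint action (assumed welfare measure)."""
    return sum(principal_payoffs[joint_action])


# Cooperation failure: agents share their principals' preferences exactly
# (perfect control), but the principals' game is a prisoner's dilemma.
dilemma = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 4),
    ("D", "C"): (4, 0), ("D", "D"): (1, 1),
}

# Control failure: the principals have common interests, but agent 2 is
# misaligned and simply prefers action D whatever its principal wants.
common_interest = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 0),
    ("D", "C"): (0, 0), ("D", "D"): (1, 1),
}
misaligned_agents = {
    joint: (common_interest[joint][0], 1 if joint[1] == "D" else 0)
    for joint in common_interest
}

for name, agent_game, principal_game in [
    ("cooperation failure", dilemma, dilemma),
    ("control failure", misaligned_agents, common_interest),
]:
    best = max(principal_welfare(principal_game, j) for j in principal_game)
    for eq in pure_nash(agent_game):
        got = principal_welfare(principal_game, eq)
        print(f"{name}: agents play {eq}, welfare {got} (best possible {best})")
```

In both toy cases the agents end up at an outcome that is worse for the principals than the best available one, but for different reasons: in the first the agents are faithful yet fail to cooperate, and in the second one agent does not pursue its principal's preferences at all.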

