AI | GT | LG | Nov 6, 2024

Policy Aggregation

U of Toronto
arXiv:2411.03651v1 | 11 citations | h-index: 62 | Has Code | NIPS
Originality: Synthesis-oriented
AI Analysis

The paper addresses the problem of aligning AI with diverse human values. The contribution is rated as incremental because it adapts existing social choice theory to a new context.

The paper tackles AI value alignment when multiple individuals have different reward functions. It formalizes the problem as policy aggregation and demonstrates that social choice methods such as approval voting and Borda count can be practically applied to identify a desirable collective policy.

We consider the challenge of AI value alignment with multiple individuals that have different reward functions and optimal policies in an underlying Markov decision process. We formalize this problem as one of policy aggregation, where the goal is to identify a desirable collective policy. We argue that an approach informed by social choice theory is especially suitable. Our key insight is that social choice methods can be reinterpreted by identifying ordinal preferences with volumes of subsets of the state-action occupancy polytope. Building on this insight, we demonstrate that a variety of methods--including approval voting, Borda count, the proportional veto core, and quantile fairness--can be practically applied to policy aggregation.
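To make the "volumes of subsets of the occupancy polytope" idea concrete, here is a minimal sketch and not the authors' implementation. It assumes a single-state MDP with three actions, so the state-action occupancy polytope reduces to the probability simplex over actions. The agent names, reward vectors, and candidate policies are hypothetical. For each agent, the sketch estimates by Monte Carlo the fraction of the polytope whose expected return is no better than a candidate's; that fraction acts as an ordinal, scale-free score for the candidate.

```python
import random

def sample_simplex(k, rng):
    """Draw a point uniformly from the simplex over k actions
    by normalizing independent exponential draws."""
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [x / total for x in draws]

def expected_return(occupancy, reward):
    """Expected return is linear in the occupancy measure."""
    return sum(d * r for d, r in zip(occupancy, reward))

def volume_quantile(candidate, reward, samples):
    """Estimated fraction of the polytope (by volume) whose return
    for this agent is at most the candidate's return."""
    target = expected_return(candidate, reward)
    no_better = sum(expected_return(s, reward) <= target for s in samples)
    return no_better / len(samples)

rng = random.Random(0)  # fixed seed for reproducibility
samples = [sample_simplex(3, rng) for _ in range(20000)]

# Hypothetical reward vectors: each agent cares about a different action.
rewards = {
    "agent_a": [1.0, 0.0, 0.0],
    "agent_b": [0.0, 0.0, 1.0],
}

# Hypothetical candidate policies, written as occupancy measures.
candidates = {
    "favor_a": [1.0, 0.0, 0.0],
    "compromise": [0.5, 0.0, 0.5],
}

quantiles = {
    name: {agent: volume_quantile(d, r, samples) for agent, r in rewards.items()}
    for name, d in candidates.items()
}

# Score each candidate by its worst-off agent's quantile.
worst_case = {name: min(q.values()) for name, q in quantiles.items()}
best = max(worst_case, key=worst_case.get)

for name, q in quantiles.items():
    print(name, q)
print("highest worst-case quantile:", best)
```

Because the scores are volume fractions rather than raw returns, rescaling one agent's reward function leaves that agent's quantiles unchanged. Selecting the candidate with the highest worst-case quantile is only loosely in the spirit of the quantile fairness notion named in the abstract; the paper's actual definitions and algorithms may differ.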

Code Implementations: 1 repo
