AI | CL | LG | OT | Feb 20, 2025

A Statistical Case Against Empirical Human-AI Alignment

arXiv:2502.14581v2 | 8 citations | h-index: 9
Originality: Synthesis-oriented
AI Analysis

This paper addresses the statistical biases that can arise when AI is aligned with observed human behavior, an incremental concern for AI safety and ethics researchers.

The paper argues that empirical human-AI alignment can introduce statistical biases. It advocates against naive empirical alignment and proposes prescriptive alignment and a posteriori empirical alignment as alternatives, illustrated by examples such as human-centric decoding of language models.

Empirical human-AI alignment aims to make AI systems act in line with observed human behavior. While noble in its goals, we argue that empirical alignment can inadvertently introduce statistical biases that warrant caution. This position paper thus advocates against naive empirical alignment, offering prescriptive alignment and a posteriori empirical alignment as alternatives. We substantiate our principled argument by tangible examples like human-centric decoding of language models.

