LGAIMay 9

The Pokémon Theorem and other Fairness Impossibility Results

arXiv:2605.0922148.9
AI Analysis

Provides a unifying theoretical framework for fairness impossibilities, offering rigorous bounds and trade-offs for practitioners in algorithmic fairness.

The paper shows that fairness impossibility results share a common RKHS geometry, proving that under unequal base rates, linear mean-fairness constraints are overdetermined. This yields the Pokémon theorem (residual violation decays at Kolmogorov m-width rate), an impossibility for fair feature learning (class collapse), and trade-off frontiers validated on benchmarks.

Fairness impossibility results often look like distinct scalar incompatibility statements. We show that several share one RKHS geometry: fairness criteria are linear constraints on conditional mean embeddings, and unequal base rates make the law of total expectation overdetermine those constraints. This view yields four results. The Kleinberg--Mullainathan--Raghavan dichotomy needs only group-conditional unbiasedness, not full calibration. The \emph{Pokémon theorem} shows that a distinct group pair satisfying any finite collection of linear mean-fairness criteria leaves a residual violation witnessed by the MMD, decaying at the Kolmogorov $m$-width rate under spectral regularity. The same tools prove an impossibility for fair feature learning: parity and class-conditional separation in representation space force class collapse under unequal base rates. The approximate relaxations yield signal and error frontiers, allowing a trade-off between real-world estimators and fairness goals. Experiments on standard fairness benchmarks are consistent with our bounds.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes