LG AIMay 9

The Pokémon Theorem and other Fairness Impossibility Results

arXiv:2605.0922148.9

AI Analysis

Provides a unifying theoretical framework for fairness impossibilities, offering rigorous bounds and trade-offs for practitioners in algorithmic fairness.

The paper shows that fairness impossibility results share a common RKHS geometry, proving that under unequal base rates, linear mean-fairness constraints are overdetermined. This yields the Pokémon theorem (residual violation decays at Kolmogorov m-width rate), an impossibility for fair feature learning (class collapse), and trade-off frontiers validated on benchmarks.

Fairness impossibility results often look like distinct scalar incompatibility statements. We show that several share one RKHS geometry: fairness criteria are linear constraints on conditional mean embeddings, and unequal base rates make the law of total expectation overdetermine those constraints. This view yields four results. The Kleinberg--Mullainathan--Raghavan dichotomy needs only group-conditional unbiasedness, not full calibration. The \emph{Pokémon theorem} shows that a distinct group pair satisfying any finite collection of linear mean-fairness criteria leaves a residual violation witnessed by the MMD, decaying at the Kolmogorov $m$-width rate under spectral regularity. The same tools prove an impossibility for fair feature learning: parity and class-conditional separation in representation space force class collapse under unequal base rates. The approximate relaxations yield signal and error frontiers, allowing a trade-off between real-world estimators and fairness goals. Experiments on standard fairness benchmarks are consistent with our bounds.

View on arXiv PDF

Similar