AI · May 9

When Can Human-AI Teams Outperform Individuals? Tight Bounds with Impossibility Guarantees

arXiv:2605.08710 · 23.8
AI Analysis

Provides the first rigorous theoretical framework explaining why human-AI complementarity is rare in practice, offering actionable design formulas for confidence-based aggregation.

The paper derives tight theoretical bounds for when human-AI teams can outperform their best individual member, showing that complementarity is possible only when the error correlation between human and AI falls below a threshold. Predicted and observed team accuracy agree with $R = 0.94$ on ImageNet-16H and $R = 0.91$ on CIFAR-10H.

Human-AI teams fail to outperform their best member in 70% of studies, yet no theory specifies when complementarity is achievable. We derive tight bounds for the broad class of confidence-based aggregation rules by integrating signal detection theory with information-theoretic analysis, yielding four results: (1) a complementarity theorem (teams outperform individuals iff error correlation $\rho_{HM} < \rho^*$, with $\rho^* \approx a$ in the symmetric near-chance regime); (2) minimax bounds showing gains scale as $\Theta(\sqrt{\Delta d})$ with metacognitive sensitivity difference; (3) an impossibility result proving no confidence-based aggregation rule achieves complementarity when $\rho_{HM} \geq \rho^*$; and (4) multi-class generalization $\rho^*_K \approx \rho^*/\sqrt{K-1}$. Predictions match observed team accuracy ($R = 0.94$ on ImageNet-16H, $R = 0.91$ on CIFAR-10H) and the multi-class threshold scaling holds on human data ($R = 0.93$, $K = 16$), with robustness under non-Gaussian distributions. The framework explains why complementarity is rare and provides actionable design formulas; results apply to aggregation, not to interactive deliberation that generates novel answers.
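The threshold test in results (1), (3), and (4) can be illustrated with a minimal Python sketch. Only the two relations stated in the abstract are taken from the source: complementarity requires $\rho_{HM} < \rho^*$, and $\rho^*_K \approx \rho^*/\sqrt{K-1}$. Everything else is an assumption made for illustration: the function names are hypothetical, the binary threshold `rho_star` is a user-supplied input (the value 0.6 in the usage note is a placeholder, not a number from the paper), and the error correlation is estimated here as the Pearson correlation of 0/1 error indicators, which may differ from the paper's exact definition.

```python
import math


def multiclass_threshold(rho_star: float, num_classes: int) -> float:
    """Scale a binary threshold rho* to K classes via rho*_K ~ rho* / sqrt(K - 1)."""
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    return rho_star / math.sqrt(num_classes - 1)


def error_correlation(human_errors, model_errors) -> float:
    """Pearson correlation between 0/1 error indicators.

    ASSUMPTION: this estimator stands in for rho_HM; the abstract does not
    give the paper's own definition.
    """
    n = len(human_errors)
    if n == 0 or n != len(model_errors):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_h = sum(human_errors) / n
    mean_m = sum(model_errors) / n
    cov = sum((h - mean_h) * (m - mean_m)
              for h, m in zip(human_errors, model_errors)) / n
    var_h = sum((h - mean_h) ** 2 for h in human_errors) / n
    var_m = sum((m - mean_m) ** 2 for m in model_errors) / n
    if var_h == 0 or var_m == 0:
        raise ValueError("correlation is undefined when an error rate is 0 or 1")
    return cov / math.sqrt(var_h * var_m)


def complementarity_possible(rho_hm: float, rho_star: float,
                             num_classes: int = 2) -> bool:
    """Apply the stated condition rho_HM < rho*_K (confidence-based aggregation only)."""
    return rho_hm < multiclass_threshold(rho_star, num_classes)
```

For example, with the placeholder `rho_star = 0.6` and `K = 16`, the scaled threshold is `0.6 / sqrt(15)`, so `complementarity_possible(0.1, 0.6, 16)` is true while `complementarity_possible(0.2, 0.6, 16)` is false. Per the abstract, this check covers aggregation rules only, not interactive deliberation.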

