A Coherence-Based Measure of AGI
The proposed measure provides a stricter, interpretable foundation for AGI measurement, a key problem for AI researchers and developers, though the contribution is incremental because it builds on an existing formalization.
The paper tackles the problem of measuring Artificial General Intelligence (AGI) by proposing a coherence-aware measure: the integral of generalized means over a continuum of compensability exponents, which penalizes imbalance across cognitive domains. Applied to published domain scores for GPT-4 and GPT-5, the coherence-adjusted measure shows that both systems remain far from general competence despite high arithmetic scores, with GPT-5 at 24%.
Recent work by \citet{hendrycks2025agidefinition} formalized \textit{Artificial General Intelligence} (AGI) as the arithmetic mean of proficiencies across cognitive domains derived from the Cattell--Horn--Carroll (CHC) model of human cognition. While elegant, this definition assumes \textit{compensability} -- that exceptional ability in some domains can offset failure in others. True general intelligence, however, should reflect \textit{coherent sufficiency}: balanced competence across all essential domains. We propose a coherence-aware measure of AGI based on the integral of generalized means over a continuum of compensability exponents. This formulation spans arithmetic, geometric, and harmonic regimes, and the resulting \textit{area under the curve} (AUC) quantifies robustness under varying compensability assumptions. Unlike the arithmetic mean, which rewards specialization, the AUC penalizes imbalance and captures inter-domain dependency. Applied to published CHC-based domain scores for GPT-4 and GPT-5, the coherence-adjusted AUC reveals that both systems remain far from general competence despite high arithmetic scores (e.g., GPT-5 at~24\%). Integrating the generalized mean thus yields a principled, interpretable, and stricter foundation for measuring genuine progress toward AGI.
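The construction described in the abstract can be illustrated with a minimal Python sketch. Several details here are assumptions rather than the paper's stated method: the exponent range $[-1, 1]$ (chosen because it runs from the harmonic to the arithmetic mean), the normalization by interval length, the small floor `eps` applied to zero scores, and the function names `generalized_mean` and `coherence_auc`. The domain scores are hypothetical and are not the published GPT-4 or GPT-5 values.

```python
import math


def generalized_mean(scores, p, eps=1e-9):
    """Power mean M_p of scores in [0, 1]; p = 0 is taken as the geometric-mean limit."""
    # Assumption: floor scores at eps so that log(0) and 0 ** negative are avoided.
    xs = [max(s, eps) for s in scores]
    n = len(xs)
    if abs(p) < 1e-12:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)


def coherence_auc(scores, p_min=-1.0, p_max=1.0, steps=2000):
    """Trapezoidal estimate of the mean value of M_p over [p_min, p_max]."""
    h = (p_max - p_min) / steps
    total = 0.0
    for i in range(steps + 1):
        p = p_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * generalized_mean(scores, p)
    # Assumption: divide by the interval length so the result stays in [0, 1].
    return total * h / (p_max - p_min)


# Hypothetical profiles with the same arithmetic mean (0.5).
balanced = [0.5, 0.5, 0.5, 0.5]
uneven = [0.9, 0.9, 0.1, 0.1]

print("balanced AUC:", coherence_auc(balanced))
print("uneven AUC:  ", coherence_auc(uneven))
```

Under these assumptions a perfectly balanced profile keeps its arithmetic score, while an uneven profile with the same arithmetic mean scores lower, since $M_p \le M_1$ for every $p \le 1$. This is the sense in which the AUC penalizes imbalance.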