AI | CY | Oct 17, 2025

From Checklists to Clusters: A Homeostatic Account of AGI Evaluation

arXiv:2510.15236v1 | h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses how researchers and developers can assess AGI more accurately, offering incremental improvements to existing evaluation frameworks.

The paper critiques current AGI evaluations for assigning symmetric weights and relying on snapshot scores, which ignore differences in domain importance and cannot show whether a capability is durable. It proposes a homeostatic account that weights domains by causal centrality and measures persistence across sessions, with the aim of reducing brittleness and gaming.

Contemporary AGI evaluations report multidomain capability profiles, yet they typically assign symmetric weights and rely on snapshot scores. This creates two problems: (i) equal weighting treats all domains as equally important when human intelligence research suggests otherwise, and (ii) snapshot testing can't distinguish durable capabilities from brittle performances that collapse under delay or stress. I argue that general intelligence -- in humans and potentially in machines -- is better understood as a homeostatic property cluster: a set of abilities plus the mechanisms that keep those abilities co-present under perturbation. On this view, AGI evaluation should weight domains by their causal centrality (their contribution to cluster stability) and require evidence of persistence across sessions. I propose two battery-compatible extensions: a centrality-prior score that imports CHC-derived weights with transparent sensitivity analysis, and a Cluster Stability Index family that separates profile persistence, durable learning, and error correction. These additions preserve multidomain breadth while reducing brittleness and gaming. I close with testable predictions and black-box protocols labs can adopt without architectural access.
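The abstract names a centrality-prior score and a Cluster Stability Index family but gives no formulas here, so the following is only an illustrative sketch of the two ideas, not the paper's definitions. It assumes a weighted mean for the centrality-weighted aggregate, a one-at-a-time weight perturbation as the sensitivity analysis, and a worst-to-best ratio across sessions as a toy persistence measure. All function names, domain names, and numbers are hypothetical.

```python
from statistics import mean


def equal_weight_score(scores):
    """Symmetric aggregate: every domain counts the same."""
    return mean(scores.values())


def centrality_weighted_score(scores, weights):
    """Weighted aggregate; weights are renormalized to sum to 1."""
    total = sum(weights[d] for d in scores)
    return sum(scores[d] * weights[d] / total for d in scores)


def weight_sensitivity(scores, weights, delta=0.1):
    """Min and max weighted score when each weight is scaled
    by (1 - delta) and (1 + delta), one domain at a time."""
    results = []
    for d in weights:
        for factor in (1 - delta, 1 + delta):
            perturbed = dict(weights)
            perturbed[d] = weights[d] * factor
            results.append(centrality_weighted_score(scores, perturbed))
    return min(results), max(results)


def profile_persistence(session_profiles):
    """Toy persistence measure (an assumption, not the paper's index):
    per-domain worst score across sessions divided by the best score,
    averaged over domains. 1.0 means nothing degraded between sessions."""
    ratios = []
    for d in session_profiles[0]:
        vals = [profile[d] for profile in session_profiles]
        best = max(vals)
        ratios.append(min(vals) / best if best > 0 else 0.0)
    return mean(ratios)


# Hypothetical domain scores in [0, 1] and hypothetical centrality weights.
scores = {"reasoning": 0.9, "memory": 0.3, "language": 0.9}
weights = {"reasoning": 0.3, "memory": 0.5, "language": 0.2}

print(equal_weight_score(scores))                  # symmetric baseline
print(centrality_weighted_score(scores, weights))  # penalizes the weak central domain
print(weight_sensitivity(scores, weights))         # how much the score moves with the prior

# Two sessions of the same system; "reasoning" halves after a delay.
sessions = [{"reasoning": 0.8, "memory": 0.6}, {"reasoning": 0.4, "memory": 0.6}]
print(profile_persistence(sessions))
```

In this made-up example, a weak score in a heavily weighted domain pulls the weighted aggregate below the equal-weight baseline, and the sensitivity range shows how far that conclusion depends on the chosen weights. The persistence ratio is one of many possible stand-ins for stability across sessions; the paper's own index family separates profile persistence, durable learning, and error correction, which this sketch does not attempt.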
