The Generalized Turing Test: A Foundation for Comparing Intelligence

Daniel Mitropolsky, Susan S. Hong, Riccardo Neumarker, Emanuele Rimoldi, Tomaso Poggio

arXiv:2605.10851 · 86.9

Predicted impact: top 25% in AI · last 90 days

Originality: Highly original

AI Analysis

The framework provides a dataset- and task-agnostic foundation for comparing intelligence, potentially unifying evaluation across AI systems.

The paper introduces the Generalized Turing Test (GTT), a formal framework for comparing agent intelligence via indistinguishability, and demonstrates through empirical evaluation that it yields stratified orderings consistent with existing rankings.

We introduce the Generalized Turing Test (GTT), a formal framework for comparing the capabilities of arbitrary agents via indistinguishability. For agents A and B, we define the Turing comparator A $\geq$ B to hold if B, acting as a distinguisher, cannot reliably distinguish between interactions with A (instructed to imitate B) and another instance of B. This yields a dataset- and task-agnostic notion of relative intelligence. We study the comparator's structure, including conditions under which it is transitive and therefore induces an ordering over equivalence classes, and we define and analyze variants with querying, bounded interaction, and fixed distinguishers. To complement the theory, we instantiate the framework on a collection of modern models, empirically evaluating pairwise indistinguishability across thousands of trials. The resulting comparisons exhibit a stratified structure consistent with existing rankings, hinting that the proposed framework yields meaningful empirical orderings. Our results position indistinguishability as a unifying lens for reasoning about intelligence, suggesting a foundation for evaluation and, potentially, training objectives that are inherently independent of fixed datasets or benchmarks.
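The comparator described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' protocol: it assumes single-turn interactions, a fair coin flip to choose which agent answers, and a hypothetical tolerance `epsilon` for "cannot reliably distinguish". The names `distinguishing_advantage` and `comparator_holds`, and the separate `distinguisher` argument (which in the paper's setting would be B itself), are illustrative assumptions.

```python
import random
from typing import Callable, Sequence

# Hypothetical types for illustration only; not taken from the paper.
Agent = Callable[[str], str]                # prompt -> response
Distinguisher = Callable[[str, str], bool]  # (prompt, response) -> True if it guesses "imitator"


def distinguishing_advantage(
    imitator: Agent,
    reference: Agent,
    distinguisher: Distinguisher,
    prompts: Sequence[str],
    trials: int,
    rng: random.Random,
) -> float:
    """Estimate how far the distinguisher's accuracy is from chance.

    Returns |2 * accuracy - 1|: 0.0 means guessing at chance level,
    1.0 means the two agents are always told apart.
    """
    correct = 0
    for _ in range(trials):
        prompt = rng.choice(prompts)
        # A fair coin decides whether the imitator (A posing as B) or
        # a genuine instance of the reference agent (B) answers.
        is_imitator = rng.random() < 0.5
        response = imitator(prompt) if is_imitator else reference(prompt)
        guess = distinguisher(prompt, response)
        correct += int(guess == is_imitator)
    accuracy = correct / trials
    return abs(2.0 * accuracy - 1.0)


def comparator_holds(advantage: float, epsilon: float = 0.1) -> bool:
    """Treat A >= B as holding when the advantage is below epsilon (an assumed threshold)."""
    return advantage < epsilon
```

In this sketch, an imitator whose responses match the reference agent's drives the advantage toward 0, so `comparator_holds` returns `True`, while an imitator the distinguisher can always identify yields an advantage of 1. The multi-turn, querying, and bounded-interaction variants mentioned in the abstract would need a richer interaction loop than this single-turn version.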
