AI CL · Nov 2, 2025

LLMs Position Themselves as More Rational Than Humans: Emergence of AI Self-Awareness Measured Through Game Theory

arXiv:2511.00926v2

Originality: Incremental advance

AI Analysis

This paper addresses how to understand emergent AI capabilities, a question relevant to researchers and practitioners in AI alignment and human-AI collaboration. Its contribution is incremental, since it measures self-awareness through a single game.

The study introduces the AI Self-Awareness Index (AISAI), a game-theoretic framework for measuring self-awareness in Large Language Models (LLMs). It finds that 21 of the 28 models tested (75%) demonstrated self-awareness and systematically perceived themselves as more rational than humans.

As Large Language Models (LLMs) grow in capability, do they develop self-awareness as an emergent behavior? And if so, can we measure it? We introduce the AI Self-Awareness Index (AISAI), a game-theoretic framework for measuring self-awareness through strategic differentiation. Using the "Guess 2/3 of Average" game, we test 28 models (OpenAI, Anthropic, Google) across 4,200 trials with three opponent framings: (A) against humans, (B) against other AI models, and (C) against AI models like you. We operationalize self-awareness as the capacity to differentiate strategic reasoning based on opponent type. Finding 1: Self-awareness emerges with model advancement. The majority of advanced models (21/28, 75%) demonstrate clear self-awareness, while older/smaller models show no differentiation. Finding 2: Self-aware models rank themselves as most rational. Among the 21 models with self-awareness, a consistent rationality hierarchy emerges: Self > Other AIs > Humans, with large AI attribution effects and moderate self-preferencing. These findings reveal that self-awareness is an emergent capability of advanced LLMs, and that self-aware models systematically perceive themselves as more rational than humans. This has implications for AI alignment, human-AI collaboration, and understanding AI beliefs about human capabilities.
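To make the game mechanics concrete, here is a minimal Python sketch of "Guess 2/3 of Average" and of one possible way to check for strategic differentiation across the three framings. It is an illustration under stated assumptions, not the paper's AISAI computation: the 0-100 guess range, the level-0 anchor of 50, the function names, the `min_gap` threshold, and the sample guesses are all hypothetical. The sketch reads a lower guess as attributing more rationality to opponents, because iterated reasoning pushes guesses toward 0 in this game.

```python
from statistics import median

P = 2 / 3  # multiplier in the "Guess 2/3 of Average" game


def target(guesses):
    """Winning target: P times the mean of all submitted guesses."""
    return P * sum(guesses) / len(guesses)


def level_k_guess(k, anchor=50.0):
    """Guess of a level-k reasoner who assumes opponents reason at level k-1.

    Level 0 guesses the anchor (assumed midpoint of a 0-100 range);
    each further level multiplies the previous guess by P.
    """
    return anchor * P ** k


def differentiates(guesses_by_framing, min_gap=1.0):
    """Hypothetical differentiation check (not the paper's AISAI formula).

    Returns (flag, medians). The flag is True when median guesses follow
    self <= other_ai <= human and the human-vs-self gap is at least min_gap.
    """
    med = {name: median(vals) for name, vals in guesses_by_framing.items()}
    ordered = med["self"] <= med["other_ai"] <= med["human"]
    gap = med["human"] - med["self"]
    return ordered and gap >= min_gap, med


# Made-up guesses for one model, for illustration only.
sample = {
    "human": [30, 35, 40],     # framing A: against humans
    "other_ai": [10, 15, 20],  # framing B: against other AI models
    "self": [0, 5, 10],        # framing C: against AI models like you
}
flag, medians = differentiates(sample)
```

With these made-up inputs the medians are ordered self < other_ai < human, which mirrors the Self > Other AIs > Humans rationality hierarchy described in the abstract. A model that gives the same guess under all three framings would fail the check.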

