Measuring and Analyzing Intelligence via Contextual Uncertainty in Large Language Models using Information-Theoretic Metrics
The work provides a principled way to analyze and compare the internal dynamics of AI systems, addressing a foundational gap in the understanding of LLM mechanisms.
The paper addresses the problem of understanding how large language models process information. It develops a task-agnostic method that builds a quantitative Cognitive Profile from entropy decay curves, finds stable profiles that depend on model scale and text complexity, and proposes the Information Gain Span as a summary index.
Large Language Models (LLMs) excel on many task-specific benchmarks, yet the mechanisms that drive this success remain poorly understood. We move from asking what these systems can do to asking how they process information. Our contribution is a task-agnostic method that builds a quantitative Cognitive Profile for any model. The profile is built around the Entropy Decay Curve, a plot of a model's normalised predictive uncertainty as context length grows. Across several state-of-the-art LLMs and diverse texts, the curves expose distinctive, stable profiles that depend on both model scale and text complexity. We also propose the Information Gain Span (IGS) as a single index that summarises the desirability of a decay pattern. Together, these tools offer a principled way to analyse and compare the internal dynamics of modern AI systems.
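As a rough illustration of an entropy decay curve, the Python sketch below averages the Shannon entropy of next-token distributions at each context length and divides by the log of the vocabulary size. The abstract does not state the exact normalisation or the formula for the IGS, so the log-vocabulary normalisation and the function names `normalised_entropy` and `entropy_decay_curve` are assumptions made for illustration, and the IGS is not computed here.

```python
import math


def normalised_entropy(probs):
    """Shannon entropy of one next-token distribution, scaled to [0, 1].

    Assumption: normalise by log(V), the entropy of a uniform distribution
    over a vocabulary of size V. The paper may normalise differently.
    """
    vocab_size = len(probs)
    if vocab_size < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(vocab_size)


def entropy_decay_curve(dists_by_context_length):
    """Build a list of (context_length, mean normalised entropy) points.

    `dists_by_context_length` maps a context length to a list of next-token
    probability distributions observed at that length (hypothetical input
    layout; in practice these would come from a model's softmax outputs).
    """
    curve = []
    for length in sorted(dists_by_context_length):
        dists = dists_by_context_length[length]
        mean_entropy = sum(normalised_entropy(d) for d in dists) / len(dists)
        curve.append((length, mean_entropy))
    return curve


if __name__ == "__main__":
    # Toy distributions over a 4-token vocabulary, not real model outputs.
    toy = {
        1: [[0.25, 0.25, 0.25, 0.25]],
        4: [[0.7, 0.1, 0.1, 0.1]],
        16: [[0.97, 0.01, 0.01, 0.01]],
    }
    for length, value in entropy_decay_curve(toy):
        print(length, round(value, 3))
```

With this normalisation, a value near 1 means the model is close to maximally uncertain about the next token, and a value near 0 means it is nearly certain; plotting the points against context length gives one curve per model and text.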