Capturing AI's Attention: Physics of Repetition, Hallucination, Bias and Beyond

arXiv:2504.04600v1 | 5 citations | h-index: 3
Originality Highly original
AI Analysis

This foundational theory could help society address trust and resilience in AI by leveraging physics expertise, though it appears incremental as an extension of existing attention concepts.

The authors developed a first-principles physics theory for the Attention mechanism in LLMs, enabling quantitative analysis of issues like repetition, hallucination, and bias, with predictions validated against large-scale LLM outputs.

We derive a first-principles physics theory of the AI engine at the heart of LLMs' 'magic' (e.g. ChatGPT, Claude): the basic Attention head. The theory allows a quantitative analysis of outstanding AI challenges such as output repetition, hallucination and harmful content, and bias (e.g. from training and fine-tuning). Its predictions are consistent with large-scale LLM outputs. Its 2-body form suggests why LLMs work so well, but hints that a generalized 3-body Attention would make such AI work even better. Its similarity to a spin-bath means that existing Physics expertise could immediately be harnessed to help Society ensure AI is trustworthy and resilient to manipulation.
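As a point of reference for the "basic Attention head" named above, the following is a minimal sketch of a standard scaled dot-product attention head in plain Python. It is the textbook mechanism, not the paper's physics derivation; the function names (`attention_head`, `softmax`, `dot`) and the toy token vectors are illustrative assumptions, and learned projection matrices and masking are omitted. One plausible reading of the "2-body" remark is that each attention score depends on exactly one pair of vectors, a query and a key.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention_head(queries, keys, values):
    """Single scaled dot-product attention head (no projections, no mask)."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Pairwise scores: each one involves exactly one query and one key.
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # The output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(wi * v[j] for wi, v in zip(w, values))
            for j in range(len(values[0]))
        ])
    return outputs, weights

# Toy input (assumed for illustration): three 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = attention_head(tokens, tokens, tokens)
```

Each row of `attn` is a probability distribution over the input tokens, so it sums to 1. A generalized 3-body variant, as hinted at in the abstract, would presumably score triples of vectors rather than pairs; that extension is not sketched here.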

