Information Gravity: A Field-Theoretic Model for Token Selection in Large Language Models
This work addresses the challenge of interpreting LLM behavior, a concern for both researchers and practitioners. Its contribution is incremental: it offers a new theoretical framework but no experimental validation.
The authors address the problem of understanding text generation in large language models by proposing a theoretical "information gravity" model based on field theory and spacetime geometry. The model offers explanations for phenomena such as hallucinations, query sensitivity, and temperature effects, but the authors provide no concrete numerical results.
We propose a theoretical model called "information gravity" to describe the text generation process in large language models (LLMs). The model uses physical apparatus from field theory and spacetime geometry to formalize the interaction between user queries and the probability distribution of generated tokens. A query is viewed as an object with "information mass" that curves the semantic space of the model, creating gravitational potential wells that "attract" tokens during generation. This model offers a mechanism to explain several observed phenomena in LLM behavior, including hallucinations (emerging from low-density semantic voids), sensitivity to query formulation (due to semantic field curvature changes), and the influence of sampling temperature on output diversity.
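The abstract gives no equations, so the following is only an illustrative sketch of the temperature effect and not the authors' formulation. It assumes each candidate token sits at some depth in a potential well, identifies that potential with the negative logit, and samples with the standard temperature-scaled softmax, p_i ∝ exp(-φ_i / T). The function names and the potential values are hypothetical.

```python
import math


def token_probabilities(potentials, temperature):
    """Boltzmann-style softmax: p_i is proportional to exp(-phi_i / T).

    Assumption: a token's potential phi_i plays the role of a negative
    logit, so deeper wells (lower phi) receive more probability mass.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Shift by the minimum potential for numerical stability.
    phi_min = min(potentials)
    weights = [math.exp(-(phi - phi_min) / temperature) for phi in potentials]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy in nats, used here as a simple diversity measure."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical potentials for four candidate tokens (lower = deeper well).
potentials = [-3.0, -1.0, -0.5, 0.0]

cold = token_probabilities(potentials, temperature=0.5)
hot = token_probabilities(potentials, temperature=2.0)

print("T=0.5:", [round(p, 3) for p in cold], "entropy:", round(entropy(cold), 3))
print("T=2.0:", [round(p, 3) for p in hot], "entropy:", round(entropy(hot), 3))
```

Under these assumptions, a low temperature concentrates probability on the token in the deepest well, while a high temperature flattens the distribution and raises its entropy, which matches the qualitative claim that sampling temperature controls output diversity.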