CL · LG · Apr 4, 2025

Language Models Are Implicitly Continuous

Oxford
arXiv:2504.03933v1 · 7 citations · h-index: 11 · Has Code · ICLR
Originality: Highly original
AI Analysis

The paper challenges the traditional interpretation of how LLMs understand language, with implications for both linguistics and engineering.

The paper shows that Transformer-based language models implicitly represent sentences as continuous-time functions over a continuous input space, a phenomenon observed in state-of-the-art LLMs like Llama2 and Mistral, which suggests they reason about language differently from humans.

Language is typically modelled with discrete sequences. However, the most successful approaches to language modelling, namely neural networks, are continuous and smooth function approximators. In this work, we show that Transformer-based language models implicitly learn to represent sentences as continuous-time functions defined over a continuous input space. This phenomenon occurs in most state-of-the-art Large Language Models (LLMs), including Llama2, Llama3, Phi3, Gemma, Gemma2, and Mistral, and suggests that LLMs reason about language in ways that fundamentally differ from humans. Our work formally extends Transformers to capture the nuances of time and space continuity in both input and output space. Our results challenge the traditional interpretation of how LLMs understand language, with several linguistic and engineering implications.

Code Implementations: 1 repo
