A Theory of Time-Sensitive Language Generation: Sparse Hallucination Beats Mode Collapse
This paper provides theoretical foundations for time-sensitive language generation for the NLP community, addressing a fundamental trade-off between consistency and timeliness.
The paper proves that timely generation of strings under a global preference ordering is impossible for eventually consistent generators, but achievable with a vanishing hallucination rate for superlinear deadlines, and shows this is tight by ruling out linear deadlines.
We study language generation in the limit under a global preference ordering on strings, as introduced by Kleinberg and Wei. As in [arXiv:2504.14370, arXiv:2511.05295], we aim for \emph{breadth}, but impose an additional requirement of timeliness: higher-ranked strings should be generated earlier. A string is then only credited if it is generated before a deadline, where its deadline is defined by a function that maps a string's rank in the target language to the time by which it must be produced. This is in keeping with a central consideration in machine learning, where inductive bias favors ``simpler'' or ``more plausible'' outputs, all else being equal. We show that timely generation is impossible in a strong sense for eventually consistent generators -- the protagonists of most prior related work. Under what is perhaps the mildest natural relaxation of consistency, a hallucination rate that vanishes over time, we show that we can circumvent our impossibility result. In particular, we can achieve optimal density with respect to any superlinear deadline function. We also show this is tight by ruling out timely generation with linear deadlines and vanishing hallucination rate.
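To make the crediting rule concrete, the following is a minimal toy sketch, not the paper's formal model. It assumes a finite target language given as a ranked list, time steps counted from 1, and a string being credited when its first appearance occurs at or before its deadline. The names `timely_credits`, `hallucination_rate`, and the example data are hypothetical and introduced only for illustration.

```python
from typing import Callable, List, Set


def timely_credits(
    outputs: List[str],
    target_ranked: List[str],
    deadline: Callable[[int], int],
) -> Set[str]:
    """Return the target strings credited under the given deadline function.

    Assumption: rank 1 is the most preferred string, time starts at 1, and a
    string is credited if its first appearance is at time t <= deadline(rank).
    """
    rank = {s: i + 1 for i, s in enumerate(target_ranked)}
    first_time = {}
    for t, s in enumerate(outputs, start=1):
        # Record only the first time each target string is generated.
        if s in rank and s not in first_time:
            first_time[s] = t
    return {s for s, t in first_time.items() if t <= deadline(rank[s])}


def hallucination_rate(outputs: List[str], target_ranked: List[str]) -> float:
    """Fraction of generated strings that lie outside the target language."""
    if not outputs:
        return 0.0
    target = set(target_ranked)
    return sum(1 for s in outputs if s not in target) / len(outputs)


# Hypothetical example: four target strings in preference order, and a
# generator that emits one string outside the target ("x") along the way.
target_ranked = ["a", "b", "c", "d"]
outputs = ["a", "x", "c", "b", "d"]

linear = timely_credits(outputs, target_ranked, lambda r: r)
quadratic = timely_credits(outputs, target_ranked, lambda r: r * r)

print(sorted(linear))     # strings credited under the deadline d(r) = r
print(sorted(quadratic))  # strings credited under the deadline d(r) = r^2
print(hallucination_rate(outputs, target_ranked))
```

In this toy run the single out-of-language output delays later strings, so the linear deadline credits only some of them while the quadratic deadline credits all four. This is only an intuition for why extra slack in the deadline helps; it does not reproduce the paper's asymptotic results.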