Random Language Model
This work addresses the fundamental problem of how structured languages arise in generative systems, a question foundational to understanding language emergence in both AI and linguistics.
The paper investigates how structured language emerges from random weighted context-free grammars. As the distribution of grammar weights broadens, it finds a phase transition from a random phase, in which sentences are noise-like, to an organized phase that carries nontrivial information.
Many complex generative systems use languages to create structured objects. We consider a model of random languages, defined by weighted context-free grammars. As the distribution of grammar weights broadens, a transition is found from a random phase, in which sentences are indistinguishable from noise, to an organized phase in which nontrivial information is carried. This marks the emergence of deep structure in the language, and can be understood by a competition between energy and entropy.
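To make the setup concrete, the following is a minimal sketch of sampling sentences from a random weighted context-free grammar. It is an illustration under stated assumptions, not the paper's exact construction: the rules are assumed to be in a Chomsky-normal-like form (a hidden symbol either branches into two hidden symbols or emits one terminal), the weights are assumed to be lognormal with a width `sigma`, and the names `make_grammar`, `sample_sentence`, `rule_entropy`, and the emission probability `p_emit` are hypothetical choices made only for this example.

```python
import math
import random


def make_grammar(n_hidden, n_terminal, sigma, rng):
    """Draw random rule weights; sigma sets how broad the weight distribution is.

    Assumption: weights are lognormal, w = exp(sigma * z) with z standard normal.
    """
    def draw():
        return math.exp(sigma * rng.gauss(0.0, 1.0))

    # Branching rules A -> B C: one weight per (A, B, C), flattened over (B, C).
    branch = [[draw() for _ in range(n_hidden * n_hidden)] for _ in range(n_hidden)]
    # Emission rules A -> t: one weight per (A, t).
    emit = [[draw() for _ in range(n_terminal)] for _ in range(n_hidden)]
    return branch, emit


def sample_sentence(grammar, rng, p_emit=0.6, max_symbols=10_000):
    """Expand the start symbol (hidden symbol 0) into a list of terminal indices.

    Assumption: each hidden symbol emits with probability p_emit and branches
    otherwise; p_emit > 0.5 keeps derivations finite on average, and
    max_symbols forces emission if a derivation grows too large.
    """
    branch, emit = grammar
    n_hidden = len(branch)
    stack = [0]
    sentence = []
    while stack:
        a = stack.pop()
        too_big = len(sentence) + len(stack) >= max_symbols
        if too_big or rng.random() < p_emit:
            t = rng.choices(range(len(emit[a])), weights=emit[a])[0]
            sentence.append(t)
        else:
            bc = rng.choices(range(n_hidden * n_hidden), weights=branch[a])[0]
            b, c = divmod(bc, n_hidden)
            stack.append(c)
            stack.append(b)  # pushed last, so the left child expands first
    return sentence


def rule_entropy(weights):
    """Shannon entropy (in nats) of the rule choice implied by a weight vector."""
    total = sum(weights)
    return -sum((w / total) * math.log(w / total) for w in weights if w > 0)


rng = random.Random(0)
for sigma in (0.0, 1.0, 4.0):
    grammar = make_grammar(n_hidden=5, n_terminal=8, sigma=sigma, rng=rng)
    print(sigma, rule_entropy(grammar[0][0]), sample_sentence(grammar, rng))
```

With `sigma = 0` every rule has the same weight, so the rule choice is uniform and the sampled sentences carry no structure. As `sigma` grows, a few rules dominate and the entropy of the rule choice drops. This gives only a rough feel for what a broadening weight distribution means; it does not by itself reproduce or locate the transition described in the abstract.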