Agent-based imitation dynamics can yield efficiently compressed population-level vocabularies
This work addresses the mechanistic basis of vocabulary evolution, a question relevant to both linguistics and AI. Its contribution is incremental: it builds on existing theories and has no broad state-of-the-art (SOTA) impact.
The study asks how natural languages evolve to compress meanings into words efficiently. It integrates evolutionary game theory with the Information Bottleneck framework and shows that near-optimal compression can arise through imprecise strategy imitation in signaling games. Key model parameters constrain the tradeoffs achieved by emergent vocabularies.
Natural languages have been argued to evolve under pressure to efficiently compress meanings into words by optimizing the Information Bottleneck (IB) complexity-accuracy tradeoff. However, the underlying social dynamics that could drive the optimization of a language's vocabulary towards efficiency remain largely unknown. In parallel, evolutionary game theory has been invoked to explain the emergence of language from rudimentary agent-level dynamics, but it has not yet been tested whether such an approach can lead to efficient compression in the IB sense. Here, we provide a unified model integrating evolutionary game theory with the IB framework and show how near-optimal compression can arise in a population through an independently motivated dynamic of imprecise strategy imitation in signaling games. We find that key parameters of the model -- namely, those that regulate precision in these games, as well as players' tendency to confuse similar states -- lead to constrained variation of the tradeoffs achieved by emergent vocabularies. Our results suggest that evolutionary game dynamics could potentially provide a mechanistic basis for the evolution of vocabularies with information-theoretically optimal and empirically attested properties.
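To make the complexity-accuracy tradeoff concrete, the following is a minimal sketch of how the two quantities could be computed for a given vocabulary. It assumes a common IB formulation in which complexity is the mutual information I(M;W) between meanings and words, and accuracy is the information I(W;U) that words retain about world states. The function names, the toy four-state domain, and the representation of each meaning as a distribution over states are illustrative assumptions, not the authors' implementation.

```python
import math


def mutual_information(joint):
    """Mutual information I(X;Y) in bits for a joint distribution joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:  # zero-probability cells contribute nothing
                mi += pxy * math.log2(pxy / (px[x] * py[y]))
    return mi


def ib_tradeoff(prior, encoder, meanings):
    """Return (complexity, accuracy) = (I(M;W), I(W;U)) for a vocabulary.

    prior[m]       : probability of meaning m (assumed input)
    encoder[m][w]  : probability that a speaker uses word w for meaning m
    meanings[m][u] : meaning m as a distribution over world states u
                     (hypothetical representation; spreading mass over
                     neighboring states is one way to model confusion
                     between similar states)
    """
    n_meanings = len(prior)
    n_words = len(encoder[0])
    n_states = len(meanings[0])
    # Joint distribution over (meaning, word).
    joint_mw = [[prior[m] * encoder[m][w] for w in range(n_words)]
                for m in range(n_meanings)]
    # Joint distribution over (word, state), marginalizing out the meaning.
    joint_wu = [[sum(prior[m] * encoder[m][w] * meanings[m][u]
                     for m in range(n_meanings))
                 for u in range(n_states)]
                for w in range(n_words)]
    return mutual_information(joint_mw), mutual_information(joint_wu)


# Toy domain: four equally likely meanings, each pointing at exactly one state.
uniform_prior = [0.25, 0.25, 0.25, 0.25]
identity_meanings = [[1.0 if u == m else 0.0 for u in range(4)] for m in range(4)]

# A two-word vocabulary that groups the meanings into two categories.
two_word_encoder = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# A one-word vocabulary that compresses everything into a single word.
one_word_encoder = [[1.0], [1.0], [1.0], [1.0]]

two_word = ib_tradeoff(uniform_prior, two_word_encoder, identity_meanings)
one_word = ib_tradeoff(uniform_prior, one_word_encoder, identity_meanings)
```

In this toy setting the two-word vocabulary costs one bit of complexity and retains one bit of accuracy, while the one-word vocabulary scores zero on both. A vocabulary counts as efficient in the IB sense when no other encoder reaches higher accuracy at the same complexity.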