Escaping Mode Collapse in LLM Generation via Geometric Regulation
For practitioners of LLM generation, this work offers a practical method for maintaining output diversity and quality even at very low entropy settings.
Mode collapse in LLM generation is reinterpreted as geometric collapse in representation space. A lightweight intervention, Reinforced Mode Regulation (RMR), reduces this collapse and enables stable generation at entropy rates as low as 0.8 nats/step, whereas standard decoding typically collapses near 2.0 nats/step.
Mode collapse is a persistent challenge in generative modeling and appears in autoregressive text generation as behaviors ranging from explicit looping to gradual loss of diversity and premature trajectory convergence. We take a dynamical-systems view and reinterpret mode collapse as reduced state-space accessibility caused by *geometric collapse*: during generation, the model's internal trajectory becomes confined to a low-dimensional region of its representation space. This implies mode collapse is not purely a token-level phenomenon and cannot be reliably solved by symbolic constraints or probability-only decoding heuristics. Guided by this perspective, we propose *Reinforced Mode Regulation* (RMR), a lightweight, online state-space intervention that regulates dominant self-reinforcing directions in the Transformer value cache (implemented as low-rank damping). Across multiple large language models, RMR substantially reduces mode collapse and enables stable, high-quality generation at extremely low entropy rates (down to 0.8 nats/step), whereas standard decoding typically collapses near 2.0 nats/step.
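
To make the idea of low-rank damping concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract states only that RMR damps dominant self-reinforcing directions in the value cache. Everything else here is an assumption made for illustration: the function names `participation_ratio` and `damp_dominant_directions`, the `rank` and `damping` parameters, the use of an SVD to find dominant directions, and the participation ratio as a proxy for how confined a trajectory is.

```python
import numpy as np


def participation_ratio(states):
    """Effective dimensionality of a (T, d) trajectory.

    Computed as (sum of eigenvalues)^2 / (sum of squared eigenvalues) of the
    covariance spectrum. Values near 1 mean the trajectory is confined to
    almost a single direction. This metric is an illustrative assumption.
    """
    centered = states - states.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2
    return float(eigenvalues.sum() ** 2 / (eigenvalues ** 2).sum())


def damp_dominant_directions(values, rank=1, damping=0.5):
    """Scale the components of cached values along their top-`rank`
    right singular directions by `damping` (hypothetical helper).

    values:  (T, d) array standing in for one head's value cache.
    damping: 1.0 leaves the cache unchanged; 0.0 removes the directions.
    """
    _, _, vt = np.linalg.svd(values, full_matrices=False)
    top = vt[:rank]                  # (rank, d) dominant directions
    coeffs = values @ top.T          # (T, rank) components along them
    return values - (1.0 - damping) * (coeffs @ top)


# Synthetic cache: isotropic noise plus one strongly dominant direction.
rng = np.random.default_rng(0)
direction = rng.normal(size=16)
direction /= np.linalg.norm(direction)
values = rng.normal(size=(64, 16)) + 10.0 * rng.normal(size=(64, 1)) * direction

damped = damp_dominant_directions(values, rank=1, damping=0.5)
print(participation_ratio(values), participation_ratio(damped))
```

In this toy setting, damping the single dominant direction raises the participation ratio, which is the sense in which such an intervention could counteract confinement to a low-dimensional region. How RMR actually identifies and regulates directions online during generation is not specified in the abstract.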