CL, AI, LG · Jan 13

Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

arXiv:2601.08808v1 · 8 citations · h-index: 18 · Has Code
Originality: Highly original
AI Analysis

This paper addresses the computational inefficiency of reasoning with large language models, offering an incremental improvement over existing methods.

The paper tackles the inefficiency of long token sequences in Chain-of-Thought reasoning by proposing Multiplex Thinking, a stochastic soft reasoning mechanism that samples candidate tokens and aggregates them into a single multiplex token. The result is improved performance on math reasoning benchmarks with shorter sequences.

Large language models often solve complex reasoning tasks more effectively with Chain-of-Thought (CoT), but at the cost of long, low-bandwidth token sequences. Humans, by contrast, often reason softly by maintaining a distribution over plausible next steps. Motivated by this, we propose Multiplex Thinking, a stochastic soft reasoning mechanism that, at each thinking step, samples K candidate tokens and aggregates their embeddings into a single continuous multiplex token. This preserves the vocabulary embedding prior and the sampling dynamics of standard discrete generation, while inducing a tractable probability distribution over multiplex rollouts. Consequently, multiplex trajectories can be directly optimized with on-policy reinforcement learning (RL). Importantly, Multiplex Thinking is self-adaptive: when the model is confident, the multiplex token is nearly discrete and behaves like standard CoT; when it is uncertain, it compactly represents multiple plausible next steps without increasing sequence length. Across challenging math reasoning benchmarks, Multiplex Thinking consistently outperforms strong discrete CoT and RL baselines from Pass@1 through Pass@1024, while producing shorter sequences. The code and checkpoints are available at https://github.com/GMLR-Penn/Multiplex-Thinking.
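The sample-and-aggregate step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `multiplex_token`, the toy vocabulary and embedding table, sampling with replacement, and the plain mean used for aggregation are all assumptions made for illustration. The abstract only states that K candidate tokens are sampled and their embeddings aggregated.

```python
import numpy as np

def multiplex_token(logits, embedding, k, rng):
    """Hypothetical helper: sample k token ids from softmax(logits)
    and average their embeddings into one continuous vector.
    Mean aggregation and sampling with replacement are assumptions."""
    z = logits - logits.max()                 # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()       # softmax over the vocabulary
    ids = rng.choice(len(probs), size=k, replace=True, p=probs)
    return embedding[ids].mean(axis=0), ids

rng = np.random.default_rng(0)
vocab, dim, k = 5, 4, 3                       # toy sizes, chosen arbitrarily
embedding = rng.normal(size=(vocab, dim))     # stand-in embedding table

# Confident step: nearly all probability mass sits on token 0.
confident = np.array([50.0, 0.0, 0.0, 0.0, 0.0])
# Uncertain step: uniform distribution over the vocabulary.
uncertain = np.zeros(vocab)

tok_c, ids_c = multiplex_token(confident, embedding, k, rng)
tok_u, ids_u = multiplex_token(uncertain, embedding, k, rng)

print("confident ids:", ids_c)
print("uncertain ids:", ids_u)
```

Under these assumptions the confident step draws the same token K times, so the multiplex token collapses to that token's embedding, which matches the "nearly discrete" behaviour the abstract describes. The uncertain step blends several embeddings into one vector of the same size, so the sequence does not grow.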

