Improving Latent Reasoning in LLMs via Soft Concept Mixing
This paper addresses a bottleneck in LLM reasoning for AI applications, but the contribution is incremental: it builds directly on prior work such as Soft Thinking.
The paper tackles the gap between LLMs' discrete-token training and their latent reasoning via soft concepts. It proposes Soft Concept Mixing (SCM), a training scheme that mixes soft concept vectors into hidden states and optimizes the process with RL, and it reports improved reasoning performance on five benchmarks.
Unlike human reasoning in abstract conceptual spaces, large language models (LLMs) typically reason by generating discrete tokens, which potentially limits their expressive power. The recent work Soft Thinking has shown that latent reasoning via soft concepts is a promising direction for LLMs, but LLMs are trained on discrete tokens. To reduce this gap between the soft concepts used in reasoning and the discrete tokens used in training, we propose Soft Concept Mixing (SCM), a soft-concept-aware training scheme that directly exposes the model to soft representations during training. Specifically, SCM constructs a soft concept vector by forming a probability-weighted average of token embeddings. This vector is then mixed into the model's hidden states, which embody rich contextual information. Finally, the entire latent reasoning process is optimized with reinforcement learning (RL). Experiments on five reasoning benchmarks demonstrate that SCM improves the reasoning performance of LLMs while maintaining stable training dynamics.
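The two construction steps in the abstract (a probability-weighted average of embeddings, then mixing into the hidden state) can be illustrated with a small sketch. This is only a toy illustration under stated assumptions, not the authors' implementation. The abstract does not say how the vector is mixed in, so the additive rule and the coefficient `alpha` below are hypothetical, as are the function names and the tiny three-token vocabulary. The RL optimization step is omitted.

```python
import math


def softmax(logits):
    """Turn next-token logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_concept_vector(probs, embeddings):
    """Probability-weighted average of token embeddings.

    probs:      one probability per vocabulary token
    embeddings: one embedding (list of floats) per vocabulary token
    """
    dim = len(embeddings[0])
    return [
        sum(p * emb[d] for p, emb in zip(probs, embeddings))
        for d in range(dim)
    ]


def mix_into_hidden(hidden, concept, alpha=0.5):
    """ASSUMPTION: additive mixing with a hypothetical weight `alpha`.

    The source only states that the vector is "mixed into" the hidden
    states; the actual mixing rule may differ.
    """
    return [h + alpha * c for h, c in zip(hidden, concept)]


# Toy vocabulary of three tokens with 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = [0.0, 0.0, 0.0]  # uniform distribution over the three tokens

probs = softmax(logits)
concept = soft_concept_vector(probs, embeddings)
hidden = [0.5, -0.5]  # stand-in for one hidden state
mixed = mix_into_hidden(hidden, concept, alpha=0.5)
```

With uniform probabilities, the soft concept vector is simply the mean of the three embeddings. A peaked distribution would pull it toward the embedding of the most likely token, which recovers ordinary discrete-token behavior in the limit.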