LG · May 4

Break the Block: Dynamic-size Reasoning Blocks for Diffusion Large Language Models via Monotonic Entropy Descent with Reinforcement Learning

arXiv:2605.02263 · 92.61 citations · Has Code
Predicted impact: top 6% in LG · last 90 days
Originality: Incremental advance
AI Analysis

For researchers working on diffusion LLMs, this work addresses the bottleneck of fixed block sizes in semi-autoregressive generation, improving reasoning coherence and adaptability.

The paper identifies that fixed-size blocks in diffusion large language models (dLLMs) hinder reasoning coherence and effectiveness, and proposes a post-training framework that learns dynamic-size reasoning blocks via reinforcement learning with a monotonic entropy descent objective, achieving consistent improvements over fixed-size baselines across reasoning benchmarks.

Recent diffusion large language models (dLLMs) have demonstrated both effectiveness and efficiency in reasoning via a block-based semi-autoregressive generation paradigm. Despite this progress, fixed-size block generation remains a critical bottleneck for effective and coherent reasoning. First, from a global perspective, different reasoning tasks call for different optimal decoding block sizes, which makes a "one-size-fits-all" assumption ineffective. Second, even within a single reasoning task, rigid block partitioning can break the logical flow and reduce reasoning coherence. Through empirical observations, we reveal that block-wise entropy fluctuates unsteadily between blocks when the reasoning is incorrect, whereas it follows a consistent descending trend when the reasoning is correct. Therefore, this paper proposes b1, a novel post-training framework for dLLMs that learns dynamic-size reasoning blocks via a Monotonic Entropy Descent objective with reinforcement learning to enhance reasoning coherence. b1 integrates seamlessly as a plug-and-play module with existing dLLM post-training algorithms. Extensive experiments across various reasoning benchmarks show b1's consistent improvement over existing fixed-size block baselines. Our code has been released at https://github.com/YanJiangJerry/Block-R1.
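The block-wise entropy trend described in the abstract can be illustrated with a small sketch. It is not the paper's objective or code. It assumes block entropy is the mean Shannon entropy of the per-token predictive distributions in a block, and it scores a trajectory by the total amount of entropy increase between consecutive blocks. The function names `token_entropy`, `block_entropies`, and `descent_violation` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one token's predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def block_entropies(
    token_dists: Sequence[Sequence[float]], block_sizes: Sequence[int]
) -> List[float]:
    """Mean token entropy per block; blocks may have different sizes.

    Assumption: "block-wise entropy" is taken to be the average token
    entropy inside each block. The paper may define it differently.
    """
    if any(size <= 0 for size in block_sizes):
        raise ValueError("every block must contain at least one token")
    if sum(block_sizes) != len(token_dists):
        raise ValueError("block sizes must cover all tokens exactly")
    entropies = []
    start = 0
    for size in block_sizes:
        block = token_dists[start:start + size]
        entropies.append(sum(token_entropy(d) for d in block) / size)
        start += size
    return entropies


def descent_violation(entropies: Sequence[float]) -> float:
    """Total entropy increase between consecutive blocks.

    Returns 0.0 when the sequence never increases (monotonic descent);
    larger values indicate a more fluctuating trend. This is an
    illustrative score, not the reward used by b1.
    """
    return sum(max(0.0, nxt - cur) for cur, nxt in zip(entropies, entropies[1:]))
```

Under these assumptions, a trajectory whose blocks grow steadily more confident yields a violation of zero, while one that moves from a confident block to an uncertain block yields a positive value. Because `block_sizes` is an argument, the same token distributions can be re-partitioned to see how the choice of block boundaries changes the trend.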

