Reversible Diffusion Decoding for Diffusion Language Models
This work addresses a specific bottleneck in diffusion-based language generation: stagnation caused by irreversible token commitments during block-wise decoding. It offers an incremental improvement in decoding reliability.
The paper tackles stagnation in diffusion language models during parallel token generation by introducing Reversible Diffusion Decoding (RDD), which backtracks to earlier blocks and re-masks uncertain tokens to recover from early commitment errors. The reported result is improved generation robustness and quality with minimal computational overhead.
Diffusion language models enable parallel token generation through block-wise decoding, but their irreversible commitments can lead to stagnation, where the reverse diffusion process fails to make further progress under a suboptimal context. We propose Reversible Diffusion Decoding (RDD), a decoding framework that introduces reversibility into block-wise diffusion generation. RDD detects stagnation as a state-dependent failure of the reverse process and enables efficient backtracking to earlier blocks without recomputation via cached model states. To avoid repeated failure trajectories, RDD applies confidence-guided re-masking to selectively reinitialize uncertain tokens while preserving reliable context. This reversible formulation allows decoding to recover from early commitment errors while maintaining the parallel efficiency of diffusion-based generation. Experiments show that RDD improves generation robustness and quality over baselines with minimal computational overhead.
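To make the decoding loop described in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the names `decode_block`, `reversible_decode`, and `toy_predict`, the two thresholds, the backtrack budget, and the stagnation test (a parallel step in which no masked token reaches the commit threshold). The paper's cached model states are stood in for by simply keeping the committed tokens and their confidences, since the toy predictor has no internal state to cache.

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # hypothetical marker for a not-yet-decoded position
Tokens = List[Optional[str]]
Predictor = Callable[[Tokens, int], Tuple[str, float]]


def decode_block(tokens, conf, start, end, predict, commit_threshold):
    """Fill masked positions in [start, end) in parallel steps.

    Returns False when a step commits nothing (treated here as stagnation).
    """
    while True:
        masked = [i for i in range(start, end) if tokens[i] is MASK]
        if not masked:
            return True
        # All proposals in one step see the same context (parallel decoding).
        proposals = {i: predict(tokens, i) for i in masked}
        accepted = {i: p for i, p in proposals.items() if p[1] >= commit_threshold}
        if not accepted:
            return False
        for i, (tok, c) in accepted.items():
            tokens[i], conf[i] = tok, c


def reversible_decode(length, block_size, predict, commit_threshold=0.5,
                      remask_threshold=0.8, max_backtracks=4):
    """Block-wise decoding with one-block backtracking (assumed policy)."""
    tokens: Tokens = [MASK] * length
    conf = [0.0] * length
    start, backtracks = 0, 0
    while start < length:
        end = min(start + block_size, length)
        if decode_block(tokens, conf, start, end, predict, commit_threshold):
            start = end
            continue
        if start == 0 or backtracks >= max_backtracks:
            break  # nowhere to backtrack to, or budget exhausted
        backtracks += 1
        # Discard the stalled block entirely.
        for i in range(start, end):
            tokens[i], conf[i] = MASK, 0.0
        # Step back one block and re-mask only its uncertain tokens,
        # keeping the confident ones as context.
        start -= block_size
        for i in range(start, start + block_size):
            if conf[i] < remask_threshold:
                tokens[i], conf[i] = MASK, 0.0
    return tokens, backtracks


def toy_predict(tokens, i):
    """Hypothetical stand-in for a diffusion LM; the target is a b c d."""
    if i == 0:
        return "a", 0.9
    if i == 1:
        # Guessed without its left neighbour, position 1 is wrong and uncertain.
        return ("b", 0.9) if tokens[0] is not MASK else ("x", 0.6)
    # Later positions can only be resolved once position 1 is correct.
    return ("cd"[i - 2], 0.9) if tokens[1] == "b" else ("?", 0.1)


result, n_backtracks = reversible_decode(4, 2, toy_predict)
print(result, n_backtracks)  # ['a', 'b', 'c', 'd'] 1
```

In this toy run the first block commits a wrong token in parallel, the second block stalls, and a single backtrack re-masks only the low-confidence position while keeping the confident one as context. How RDD itself detects stagnation, how far it backtracks, and how it chooses which tokens to re-mask are specified in the paper, not here.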