LG · Sep 27, 2025

Planner Aware Path Learning in Diffusion Language Models Training

arXiv:2509.23405v1 · 10 citations · h-index: 17
Originality: Highly original
AI Analysis

This work addresses a mismatch between training and inference in diffusion language models and proposes a training method that aligns the two, improving performance in protein, text, and code generation.

The paper tackles the mismatch between the uniformly random denoising paths used in training and the planner-based paths used at inference in diffusion language models. It introduces a training objective that incorporates planner-based reverse dynamics, yielding a 40% relative gain in protein sequence modeling, up to a 4x improvement in MAUVE for text generation, and a 23% relative gain in HumanEval pass@10 for code generation.

Diffusion language models have emerged as a powerful alternative to autoregressive models, enabling fast inference through flexible and parallel generation paths. This flexibility is enabled by new sampling strategies, or planners, that iteratively choose where to denoise along the sequence rather than sampling uniformly at random. However, by modifying reverse paths, planners introduce a mismatch between the uniformly random denoising paths used during training and the planning-based paths used at inference. In this work, we systematically investigate this mismatch and theoretically show that the standard discrete diffusion training evidence lower bound (ELBO) does not accurately describe a denoiser under non-uniform planning. To bridge this gap, we derive a new Planned Evidence Lower Bound (P-ELBO) that directly incorporates planner-based reverse dynamics into the training objective. Building on this, we propose Planner Aware Path Learning (PAPL), a simple and effective modification of the standard masked discrete diffusion loss that aligns training and inference under planned denoisers. Empirically, PAPL delivers consistent improvements across domains, including a 40% relative gain in protein sequence modeling, up to a 4x improvement in MAUVE for text generation, and a 23% relative gain in HumanEval pass@10 for code generation.
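
To make the path mismatch described in the abstract concrete, here is a minimal sketch contrasting a uniformly random unmasking order with a planner-chosen one. It is an illustration only, not the paper's PAPL objective or any released code: the function names, the `MASK` marker, the greedy "most confident first" planner, and the fixed confidence values are all assumptions made for this example. A real denoiser would also recompute its confidences after every step.

```python
import random

MASK = None  # hypothetical marker for a still-masked position


def masked_positions(seq):
    """Indices of positions that have not been denoised yet."""
    return [i for i, tok in enumerate(seq) if tok is MASK]


def uniform_planner(seq, confidences, rng):
    """Pick a masked position uniformly at random
    (the kind of path standard training assumes)."""
    return rng.choice(masked_positions(seq))


def greedy_planner(seq, confidences, rng=None):
    """Pick the masked position with the highest confidence
    (one simple, assumed example of a non-uniform planner)."""
    return max(masked_positions(seq), key=lambda i: confidences[i])


def unmask_order(planner, confidences, length, rng):
    """Run the reverse process and record the order in which positions are denoised."""
    seq = [MASK] * length
    order = []
    while masked_positions(seq):
        i = planner(seq, confidences, rng)
        seq[i] = "x"  # placeholder for the token a denoiser would sample here
        order.append(i)
    return order


# Made-up per-position confidences, held fixed for simplicity.
confidences = [0.2, 0.9, 0.5, 0.7]
rng = random.Random(0)

greedy_order = unmask_order(greedy_planner, confidences, 4, rng)
uniform_order = unmask_order(uniform_planner, confidences, 4, rng)

print("planner path:", greedy_order)
print("uniform path:", uniform_order)
```

Under the greedy planner the path is always the same for a given set of confidences, whereas the uniform planner spreads probability over all orderings. A denoiser trained only on uniformly sampled paths is therefore evaluated at inference on a path distribution it was not trained for, which is the gap the abstract says P-ELBO and PAPL are designed to close.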
