CL · LG · May 19

Drifting Objectives for Refining Discrete Diffusion Language Models

Daisuke Oba, Hiroki Furuta, Naoaki Okazaki

arXiv:2605.19470 · 90.1

Predicted impact: top 36% in CL · last 90 days
Originality: Incremental advance

AI Analysis

This work provides a practical refinement objective for discrete diffusion language models, addressing the challenge of non-differentiable discrete tokens.

TokenDrift improves discrete diffusion language models by applying anti-symmetric drifting to soft-token features, reducing generation perplexity at 4 NFEs by 89% on MDLM and 86% on DUO.

Discrete diffusion language models (DDLMs) generate text by iteratively denoising categorical token sequences, while recent drifting methods for continuous generators suggest that part of this sampling-time correction can instead be absorbed into training through an anti-symmetric fixed-point objective. We study how to transfer this principle to DDLMs, where the main challenge is the interface with discrete text: hard token samples are non-differentiable, and categorical predictions do not directly provide continuous samples to drift. We formulate TokenDrift, a drifting objective that lifts categorical predictions to soft-token features, applies anti-symmetric drifting in a frozen semantic space, and backpropagates the resulting stop-gradient feature target to DDLM logits. In controlled continual-training experiments with masked and uniform-state diffusion backbones, TokenDrift improves fixed-NFE generation quality over matched continuation baselines, reducing Gen.-PPL at 4 NFEs by 89% on MDLM and 86% on DUO. These results suggest that drifting can provide a practical refinement objective for DDLMs.
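
The pipeline the abstract describes (logits, then soft-token features, then a drift in a frozen feature space, then a stop-gradient target) can be sketched roughly as below. This is a minimal illustration and not the authors' implementation. The abstract does not give the drift formula, so the kernel mean-shift form, the mean pooling over positions, the use of a plain embedding table as the "frozen semantic space", and every name here (`soft_token_feature`, `mean_shift`, `drift`, `drifting_loss`, `tau`) are assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_token_feature(logits, embed):
    """Lift categorical predictions to one continuous feature vector.

    logits: L x V list of per-position logit vectors.
    embed:  frozen V x D table (assumed stand-in for the semantic space).
    Returns the expected embedding, mean-pooled over positions (assumption).
    """
    dim = len(embed[0])
    feat = [0.0] * dim
    for pos_logits in logits:
        probs = softmax(pos_logits)
        for p, row in zip(probs, embed):
            for d in range(dim):
                feat[d] += p * row[d] / len(logits)
    return feat

def mean_shift(x, points, tau):
    """Kernel-weighted mean of `points` minus x (assumed drift building block)."""
    weights = softmax([-sum((a - b) ** 2 for a, b in zip(x, p)) / tau
                       for p in points])
    return [sum(w * p[d] for w, p in zip(weights, points)) - x[d]
            for d in range(len(x))]

def drift(x, positives, negatives, tau=1.0):
    """Attraction to positives minus attraction to negatives.

    Swapping the two sets flips the sign, and the drift vanishes when the
    sets coincide, which is the fixed point of the objective.
    """
    up = mean_shift(x, positives, tau)
    down = mean_shift(x, negatives, tau)
    return [u - v for u, v in zip(up, down)]

def drifting_loss(batch_logits, embed, data_feats, tau=1.0):
    """Mean squared distance between each feature and its drifted target."""
    gen = [soft_token_feature(lg, embed) for lg in batch_logits]
    loss = 0.0
    for x in gen:
        v = drift(x, data_feats, gen, tau)
        # In an autodiff framework this target would be wrapped in a
        # stop-gradient, so gradients reach the logits only through x.
        target = [a + b for a, b in zip(x, v)]
        loss += sum((a - t) ** 2 for a, t in zip(x, target))
    return loss / len(gen)
```

In this sketch the loss is zero when the generated features already match the reference features, and the gradient that would reach the logits pulls each soft-token feature toward its drifted target. Plain Python has no automatic differentiation, so the stop-gradient appears only as a comment; a real training loop would need a framework that supports it.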
