CV · May 26, 2025

ReDDiT: Rehashing Noise for Discrete Visual Generation

Tianren Ma, Xiaosong Zhang, Boyu Yang, Junlan Feng, Qixiang Ye

arXiv:2505.19656v3 · 10.22 citations · h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses the quality gap between discrete and continuous diffusion models in visual generation, offering an incremental improvement over existing methods.

The paper tackles the underperformance of discrete diffusion models relative to continuous ones in visual generation. It proposes ReDDiT, a rehashing noise approach that extends absorbing states and improves expressive capacity, reducing gFID from 6.18 to 1.61 (roughly a 74% relative reduction) and reaching performance competitive with continuous models.

In visual generation, discrete diffusion models are gaining traction for their efficiency and compatibility. However, pioneering attempts still fall behind their continuous counterparts, which we attribute to the noise (absorbing state) design and sampling heuristics. In this study, we propose a rehashing noise approach for the discrete diffusion transformer (termed ReDDiT), which aims to extend absorbing states and improve the expressive capacity of discrete diffusion models. ReDDiT enriches the potential paths that latent variables traverse during training with randomized multi-index corruption. The derived rehash sampler, which reverses the randomized absorbing paths, guarantees high diversity and low discrepancy of the generation process. These reformulations lead to more consistent and competitive generation quality, mitigating the need for heavily tuned randomness. Experiments show that ReDDiT significantly outperforms the baseline model (reducing gFID from 6.18 to 1.61) and is on par with its continuous counterparts.
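The abstract does not spell out how "randomized multi-index corruption" works, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the single absorbing (mask) index of a standard absorbing-state discrete diffusion is extended to a small set of indices placed just above the regular vocabulary, and that each corrupted token is assigned one of them uniformly at random. The names `corrupt`, `is_absorbed`, `vocab_size`, and `num_absorbing` are hypothetical.

```python
import random


def corrupt(tokens, t, vocab_size, num_absorbing, rng):
    """Forward corruption sketch (assumed form, not taken from the paper).

    Each token is independently absorbed with probability t. Instead of a
    single mask index, an absorbed token receives a random index from the
    extended range [vocab_size, vocab_size + num_absorbing).
    """
    noisy = []
    for tok in tokens:
        if rng.random() < t:
            # Pick one of the assumed absorbing indices uniformly at random.
            noisy.append(vocab_size + rng.randrange(num_absorbing))
        else:
            noisy.append(tok)
    return noisy


def is_absorbed(tok, vocab_size):
    """A token counts as absorbed if it lies above the regular vocabulary."""
    return tok >= vocab_size


rng = random.Random(0)
tokens = [5, 17, 3, 42, 8, 8, 23, 1]
noisy = corrupt(tokens, t=0.5, vocab_size=64, num_absorbing=4, rng=rng)
```

With `num_absorbing=1` this reduces to ordinary single-mask corruption, which makes the assumed extension easy to compare against a standard absorbing-state baseline.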
