ML · LG · Aug 6, 2025

The Cosine Schedule is Fisher-Rao-Optimal for Masked Discrete Diffusion Models

arXiv:2508.04884v2 · 9 citations · h-index: 1
Originality: Synthesis-oriented
AI Analysis

The paper provides a theoretical justification for an existing heuristic in diffusion models. The contribution is incremental, but it clarifies a key design choice for practitioners.

The authors address the problem of selecting a discretization schedule for sampling from masked discrete diffusion models, and show that the optimal schedule under the Fisher-Rao geometry matches the widely used cosine schedule.

In this work, we study the problem of choosing the discretisation schedule for sampling from masked discrete diffusion models in terms of the information geometry of the induced probability path. Specifically, we show that the optimal schedule under the Fisher-Rao geometry recovers the popularly used cosine schedule.
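As a rough illustration of the claim, and not the paper's own derivation, the sketch below assumes that the probability path can be summarised by a single per-token mask probability m, treated as a Bernoulli parameter. Under that assumption the Fisher-Rao distance has the closed form 2|arcsin(sqrt(p)) - arcsin(sqrt(q))|, and a schedule of the form m_i = cos^2(pi * i / (2N)) takes steps of equal Fisher-Rao length. The function names and this parameterisation are hypothetical choices made for the example and may differ from the paper's exact statement.

```python
import math


def fisher_rao_bernoulli(p, q):
    """Fisher-Rao distance between Bernoulli(p) and Bernoulli(q)."""
    return 2.0 * abs(math.asin(math.sqrt(p)) - math.asin(math.sqrt(q)))


def cosine_schedule(num_steps):
    """Mask probabilities from 1 (fully masked) to 0 (fully unmasked).

    Assumed form for illustration: m_i = cos^2(pi * i / (2 * num_steps)).
    """
    return [
        math.cos(math.pi * i / (2 * num_steps)) ** 2
        for i in range(num_steps + 1)
    ]


num_steps = 8
schedule = cosine_schedule(num_steps)

# Fisher-Rao length of each consecutive step along the schedule.
step_lengths = [
    fisher_rao_bernoulli(a, b) for a, b in zip(schedule, schedule[1:])
]

for i, (m, d) in enumerate(zip(schedule[1:], step_lengths), start=1):
    print(f"step {i}: mask prob = {m:.4f}, Fisher-Rao step length = {d:.6f}")
```

Every step length comes out equal to pi / num_steps, which is the sense in which the cosine schedule moves at constant speed along the path in this simplified setting. A uniform grid in m would instead take its longest Fisher-Rao steps near m = 0 and m = 1.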

