TUBE: Tangent Upper Bound on Evidence for Discrete Diffusion Language Models
Provides a rigorous likelihood evaluation tool for discrete diffusion language models and shows that autoregressive models still achieve higher likelihood, a result that matters to practitioners choosing between generative model families.
The paper introduces TUBE, a variational upper bound on log-likelihood for discrete diffusion models that admits an unbiased Monte Carlo estimator. Applied to block masked diffusion models and block any-order autoregressive models, TUBE shows that autoregressive models still achieve higher likelihoods.
Log-likelihood is a standard metric for evaluating generative models. Unfortunately, in contrast to autoregressive models (ARMs), discrete diffusion models generally do not admit exact computation of this quantity. Existing evaluations therefore rely on the evidence lower bound (ELBO), which leaves unclear how much higher the true value may be. We address this by introducing the Tangent Upper Bound on Evidence (TUBE), a variational upper bound on log-likelihood that admits an unbiased Monte Carlo estimator. TUBE applies to a range of latent-variable models, including masked diffusion models (MDMs), any-order ARMs (AO-ARMs), and block variants of both. Applied to block MDMs and block AO-ARMs, TUBE yields our key empirical finding: the log-likelihoods of these models lie strictly below the exact ARM baseline, showing that ARMs still dominate in likelihood.
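
The abstract does not state the form of the bound, so the following is only a sketch of one natural reading of the word "tangent", not the paper's construction. It assumes the bound comes from the tangent line to the concave logarithm: for any a > 0, log p(x) <= log a + p(x)/a - 1, with equality at a = p(x). Because the right-hand side is linear in p(x) = E_z[p(x | z)], replacing p(x) with a sample average gives an unbiased estimate of the bound. The toy discrete latent-variable model and all function names (`evidence`, `elbo`, `tangent_upper_bound`, `tangent_upper_bound_mc`) are hypothetical and chosen for illustration.

```python
import math
import random


def evidence(prior, lik):
    """Exact p(x) = sum_z p(z) p(x | z) for a small discrete latent z."""
    return sum(p * l for p, l in zip(prior, lik))


def elbo(prior, lik, q):
    """Jensen lower bound: E_q[log(p(z) p(x | z) / q(z))] <= log p(x)."""
    return sum(
        qi * math.log(p * l / qi)
        for p, l, qi in zip(prior, lik, q)
        if qi > 0
    )


def tangent_upper_bound(prior, lik, a):
    """Assumed tangent bound: log p(x) <= log a + p(x) / a - 1 for a > 0."""
    return math.log(a) + evidence(prior, lik) / a - 1.0


def tangent_upper_bound_mc(prior, lik, a, n_samples, rng):
    """Monte Carlo version: sample z ~ p(z) and average p(x | z).

    The bound is linear in p(x), so this estimate is unbiased for the
    bound itself (not for log p(x)).
    """
    zs = rng.choices(range(len(prior)), weights=prior, k=n_samples)
    mean_lik = sum(lik[z] for z in zs) / n_samples
    return math.log(a) + mean_lik / a - 1.0


if __name__ == "__main__":
    prior = [0.5, 0.3, 0.2]   # p(z), toy values
    lik = [0.1, 0.4, 0.05]    # p(x | z) for one fixed x, toy values
    a = 0.3                   # variational parameter; any a > 0 is valid
    rng = random.Random(0)
    print("ELBO         :", elbo(prior, lik, prior))
    print("log p(x)     :", math.log(evidence(prior, lik)))
    print("tangent bound:", tangent_upper_bound(prior, lik, a))
    print("MC estimate  :", tangent_upper_bound_mc(prior, lik, a, 20000, rng))
```

In this sketch the ELBO and the tangent bound sandwich the exact log-evidence, and the gap on the upper side closes as `a` approaches p(x). A practical estimator would have to choose or learn `a` without knowing p(x), which this toy example does not address.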