LG, AI
Feb 19, 2025

Interleaved Gibbs Diffusion: Generating Discrete-Continuous Data with Implicit Constraints

arXiv:2502.13450v2
h-index: 15
Originality: Highly original
AI Analysis

The paper addresses the challenge of modeling complex dependencies in discrete-continuous data for applications such as drug discovery and design. It presents a novel method rather than an incremental improvement.

The paper tackles the problem of generating discrete-continuous data with implicit constraints by introducing Interleaved Gibbs Diffusion (IGD), a generative modeling framework that avoids factorized denoising distributions in order to better capture dependencies between variables. It reports state-of-the-art performance on three generation tasks: molecule structures, layouts, and tabular data.

We introduce Interleaved Gibbs Diffusion (IGD), a novel generative modeling framework for discrete-continuous data, focusing on problems with important, implicit and unspecified constraints in the data. Most prior works on discrete and discrete-continuous diffusion assume a factorized denoising distribution, which can hinder the modeling of strong dependencies between random variables in such problems. We empirically demonstrate a significant improvement in 3-SAT performance out of the box by switching to a Gibbs-sampling-style discrete diffusion model that does not assume factorizability. Motivated by this, we introduce IGD, which generalizes discrete-time Gibbs-sampling-type Markov chains to the case of discrete-continuous generation. IGD allows for seamless integration between discrete and continuous denoisers while theoretically guaranteeing exact reversal of a suitable forward process. Further, it provides flexibility in the choice of denoisers and allows conditional generation via state-space doubling and inference-time refinement. Empirical evaluations on three challenging generation tasks (molecule structures, layouts and tabular data) demonstrate state-of-the-art performance. Notably, IGD achieves state-of-the-art results without relying on domain-specific inductive biases like equivariant diffusion or auxiliary losses. We explore a wide range of modeling and interleaving strategies, along with hyperparameters, in each of these problems.
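
The abstract's point about factorized denoising distributions can be illustrated with a toy example. The sketch below is not IGD and is not taken from the paper. It assumes a hypothetical two-bit distribution in which both bits must be equal, and it compares sampling each bit independently from its marginal with sampling one coordinate at a time conditioned on the coordinates already drawn. All names (`VALID`, `sample_factorized`, `sample_sequential`, `violation_rate`) are made up for illustration.

```python
import random

# Toy data distribution (an assumption for illustration): two bits that
# must be equal. Valid samples are (0, 0) and (1, 1), each with prob. 1/2.
VALID = [(0, 0), (1, 1)]


def sample_factorized(rng):
    """Draw each bit independently from its marginal, Bernoulli(1/2)."""
    return (rng.randint(0, 1), rng.randint(0, 1))


def sample_sequential(rng):
    """Draw one coordinate at a time, conditioning on earlier coordinates."""
    x0 = rng.randint(0, 1)  # marginal of the first bit
    x1 = x0                 # exact conditional p(x1 | x0): x1 must equal x0
    return (x0, x1)


def violation_rate(sampler, n=10_000, seed=0):
    """Fraction of n samples that break the equality constraint."""
    rng = random.Random(seed)
    bad = sum(sampler(rng) not in VALID for _ in range(n))
    return bad / n


if __name__ == "__main__":
    print("factorized:", violation_rate(sample_factorized))
    print("sequential:", violation_rate(sample_sequential))
```

Both samplers match the per-bit marginals, but only the coordinate-wise sampler respects the dependency. The factorized sampler violates the constraint about half the time, which is the kind of failure the abstract attributes to factorized denoisers on strongly constrained data.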

