LG, AI, CV, GR | May 22, 2024

ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models

arXiv:2405.13729v2 | 3 citations | h-index: 24
Originality: Incremental advance
AI Analysis

ComboStoc addresses a bottleneck in diffusion models for structured generation tasks, offering incremental improvements in training efficiency and test-time control.

The paper tackles the problem of insufficient sampling of combinatorial structures in diffusion generative models, which degrades test-time performance, and presents ComboStoc, a simple fix that accelerates training across images and 3D shapes and enables new test-time generation with varying control over dimensions and attributes.

In this paper, we study an under-explored but important factor of diffusion generative models, namely combinatorial complexity. Data samples are generally high-dimensional, and for various structured generation tasks, additional attributes are combined and associated with the data samples. We show that the space spanned by the combination of dimensions and attributes is insufficiently sampled by the existing training schemes of diffusion generative models, causing degraded test-time performance. We present a simple fix to this problem by constructing stochastic processes that fully exploit the combinatorial structures, hence the name ComboStoc. Using this simple strategy, we show that network training is significantly accelerated across diverse data modalities, including images and 3D structured shapes. Moreover, ComboStoc enables a new way of test-time generation which uses asynchronous time steps for different dimensions and attributes, thus allowing for varying degrees of control over them.
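To make the contrast in the abstract concrete, the following is a minimal sketch of synchronized versus per-dimension time-step sampling when building noisy training inputs. It is an illustration under stated assumptions, not the authors' implementation: the linear interpolation between noise and data, the uniform time distribution, and the names `sample_timesteps` and `interpolate` are all hypothetical choices that this summary does not specify.

```python
import numpy as np


def sample_timesteps(rng, batch_size, dim, synchronized=True):
    """Return a (batch_size, dim) array of time steps in [0, 1).

    Hypothetical helper: the uniform distribution is an assumption.
    """
    if synchronized:
        # Conventional scheme: one scalar per sample, shared by all dimensions.
        t = rng.uniform(size=(batch_size, 1))
        return np.repeat(t, dim, axis=1)
    # Combinatorial scheme: an independent time step for every dimension.
    return rng.uniform(size=(batch_size, dim))


def interpolate(x_data, noise, t):
    """Blend data and noise elementwise.

    Assumed convention: t = 0 gives pure noise, t = 1 gives clean data.
    """
    return t * x_data + (1.0 - t) * noise


rng = np.random.default_rng(0)
x_data = rng.normal(size=(4, 6))   # toy batch: 4 samples, 6 dimensions
noise = rng.normal(size=(4, 6))

t_sync = sample_timesteps(rng, 4, 6, synchronized=True)
t_async = sample_timesteps(rng, 4, 6, synchronized=False)

x_sync = interpolate(x_data, noise, t_sync)     # every dimension equally noisy
x_async = interpolate(x_data, noise, t_async)   # noise level varies per dimension
```

With a shared scalar, each training input lies on a single line between a noise sample and a data sample. With per-dimension time steps, inputs cover mixtures in which some dimensions are nearly clean while others are still mostly noise, which is the kind of combination the abstract describes as insufficiently sampled.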
