ML CV LG

Oct 2, 2023

Mirror Diffusion Models for Constrained and Watermarked Generation

Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou, Molei Tao

arXiv:2310.01236v224.444 citations

h-index: 47

Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of tractable diffusion modeling for constrained data generation, offering incremental algorithmic improvements with potential applications in domains requiring safety and privacy.

The authors address data generation on constrained sets, a setting where prior diffusion-based approaches lost the analytic tractability available in Euclidean space. They propose Mirror Diffusion Models (MDM), which learn the diffusion process in a dual Euclidean space obtained through a mirror map, and report significantly improved performance over existing methods. They derive efficient mirror-map computations for simplices and ℓ₂-balls, and explore constrained sets as a watermarking mechanism for safety and privacy.

Modern successes of diffusion models in learning complex, high-dimensional data distributions are attributed, in part, to their capability to construct diffusion processes with analytic transition kernels and score functions. The tractability results in a simulation-free framework with stable regression losses, from which reversed, generative processes can be learned at scale. However, when data is confined to a constrained set as opposed to a standard Euclidean space, these desirable characteristics appear to be lost based on prior attempts. In this work, we propose Mirror Diffusion Models (MDM), a new class of diffusion models that generate data on convex constrained sets without losing any tractability. This is achieved by learning diffusion processes in a dual space constructed from a mirror map, which, crucially, is a standard Euclidean space. We derive efficient computation of mirror maps for popular constrained sets, such as simplices and $\ell_2$-balls, showing significantly improved performance of MDM over existing methods. For safety and privacy purposes, we also explore constrained sets as a new mechanism to embed invisible but quantitative information (i.e., watermarks) in generated data, for which MDM serves as a compelling approach. Our work brings new algorithmic opportunities for learning tractable diffusion on complex domains. Our code is available at https://github.com/ghliu/mdm
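To make the mirror-map idea concrete, here is a minimal sketch in Python for the open unit ℓ₂-ball. It assumes a log-barrier potential phi(x) = -log(1 - ||x||^2), chosen here for illustration and not necessarily the exact map used in the paper. The function names `to_dual`, `to_primal`, and `noisy_roundtrip` are hypothetical. The gradient of phi sends the ball to all of Euclidean space, and its closed-form inverse sends any dual point back strictly inside the ball, so noise added in the dual space never violates the constraint.

```python
import math
import random


def to_dual(x):
    """Mirror map (gradient of the assumed potential phi(x) = -log(1 - ||x||^2)).

    Sends a point strictly inside the unit ball to unconstrained dual space.
    """
    sq_norm = sum(v * v for v in x)
    if sq_norm >= 1.0:
        raise ValueError("x must lie strictly inside the unit ball")
    scale = 2.0 / (1.0 - sq_norm)
    return [scale * v for v in x]


def to_primal(y):
    """Inverse mirror map: x = y / (1 + sqrt(1 + ||y||^2)).

    Obtained by solving y = 2x / (1 - ||x||^2) for x; the result always
    has norm below 1.
    """
    sq_norm = sum(v * v for v in y)
    scale = 1.0 / (1.0 + math.sqrt(1.0 + sq_norm))
    return [scale * v for v in y]


def noisy_roundtrip(x, sigma, rng):
    """Perturb x with Gaussian noise in the dual space, then map back.

    This is only a stand-in for one step of a Euclidean diffusion; it is
    not the training or sampling procedure of the paper.
    """
    y = to_dual(x)
    y_noisy = [v + sigma * rng.gauss(0.0, 1.0) for v in y]
    return to_primal(y_noisy)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.3, -0.4, 0.5]
    z = noisy_roundtrip(x, sigma=10.0, rng=rng)
    print("perturbed point:", z)
    print("squared norm:", sum(v * v for v in z))
```

Because the constraint is enforced by the inverse map rather than by clipping or rejection, any standard Euclidean diffusion model could in principle be trained on the dual-space points under this sketch's assumptions.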
