Generalized Denoising Diffusion Codebook Models (gDDCM): Tokenizing images using a pre-trained diffusion model
This work addresses a domain-specific problem for researchers in image compression and diffusion models, but it is incremental, building directly on the existing DDCM method.
The paper addresses the restriction of Denoising Diffusion Codebook Models (DDCM) to DDPM by proposing a generalized version (gDDCM) that extends DDCM-style image compression to mainstream diffusion models and their variants, and it reports improved performance on the CIFAR-10 and LSUN Bedroom datasets.
Recently, Denoising Diffusion Codebook Models (DDCM) were proposed. DDCM builds on the Denoising Diffusion Probabilistic Model (DDPM) and replaces the random noise in the backward process with noise selected from fixed sets (codebooks) according to a predefined rule, thereby enabling image compression. However, DDCM cannot be applied to methods other than DDPM. In this paper, we propose Generalized Denoising Diffusion Codebook Models (gDDCM), which extend DDCM to mainstream diffusion models and their variants, including DDPM, Score-Based Models, Consistency Models, and Rectified Flow. We evaluate our method on the CIFAR-10 and LSUN Bedroom datasets. Experimental results demonstrate that our approach successfully generalizes DDCM to the aforementioned models and achieves improved performance.
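To make the codebook idea concrete, the following is a minimal toy sketch of DDCM-style encoding and decoding on top of a DDPM backward process. It is not the authors' implementation. Everything named here is an assumption for illustration: the sizes `T`, `K`, `D`, the noise schedule, the stand-in `toy_denoiser` (a real system would call a pre-trained network), and in particular the selection rule, which picks the codebook entry with the largest inner product with the residual between the target and the current prediction. The abstract only says "a predefined rule", so this rule is one plausible choice, not a confirmed detail.

```python
import numpy as np

# Hypothetical toy sizes: diffusion steps, codebook entries per step, data dimension.
T, K, D = 50, 64, 16

# Standard DDPM quantities for an assumed linear beta schedule.
betas = np.linspace(1e-4, 0.2, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)
alpha_bar_prev = np.concatenate(([1.0], alpha_bar[:-1]))


def make_shared_state(seed=0):
    """Initial sample x_T and per-step codebooks, reproducible from a shared seed."""
    rng = np.random.default_rng(seed)
    x_T = rng.standard_normal(D)
    codebooks = rng.standard_normal((T, K, D))  # K Gaussian noise vectors per step
    return x_T, codebooks


def toy_denoiser(x_t, t):
    """Stand-in for a pre-trained network's prediction of the clean sample (assumption)."""
    return np.sqrt(alpha_bar[t]) * x_t


def reverse_step(x_t, t, noise):
    """One DDPM backward step using the posterior mean and std, with the given noise."""
    x0_hat = toy_denoiser(x_t, t)
    coef_x0 = np.sqrt(alpha_bar_prev[t]) * betas[t] / (1.0 - alpha_bar[t])
    coef_xt = np.sqrt(alphas[t]) * (1.0 - alpha_bar_prev[t]) / (1.0 - alpha_bar[t])
    sigma = np.sqrt(betas[t] * (1.0 - alpha_bar_prev[t]) / (1.0 - alpha_bar[t]))
    return coef_x0 * x0_hat + coef_xt * x_t + sigma * noise


def encode(x0, seed=0):
    """Run the backward process, choosing codebook noise that points toward x0."""
    x, codebooks = make_shared_state(seed)
    indices = []
    for t in reversed(range(T)):
        residual = x0 - toy_denoiser(x, t)
        # Assumed selection rule: maximize the inner product with the residual.
        k = int(np.argmax(codebooks[t] @ residual))
        indices.append(k)
        x = reverse_step(x, t, codebooks[t, k])
    return indices, x


def decode(indices, seed=0):
    """Replay the backward process from the stored indices alone."""
    x, codebooks = make_shared_state(seed)
    for t, k in zip(reversed(range(T)), indices):
        x = reverse_step(x, t, codebooks[t, k])
    return x


target = np.linspace(-1.0, 1.0, D)   # toy "image"
indices, recon_enc = encode(target)  # the compressed code is the index list
recon_dec = decode(indices)          # decoder needs only indices and the seed
```

In this sketch the compressed representation is the list of `T` indices, each costing `log2(K)` bits, and the decoder reproduces the encoder's trajectory because both sides regenerate the same codebooks and starting sample from the shared seed. With the untrained stand-in denoiser the reconstruction quality is not meaningful; the point is only the mechanics of replacing random noise with indexed codebook noise.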