CV · Nov 24, 2025

CoD: A Diffusion Foundation Model for Image Compression

arXiv:2511.18706v2 · 4 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for efficient diffusion-based image compression, particularly at ultra-low bitrates. The advance appears incremental, since it builds on existing diffusion codec frameworks.

The paper tackles the problem of suboptimal text conditioning in diffusion-based image compression by introducing CoD, a compression-oriented diffusion foundation model trained from scratch. Used in place of Stable Diffusion in downstream codecs such as DiffC, it achieves state-of-the-art results at ultra-low bitrates (e.g., 0.0039 bpp), and its training is roughly 300× cheaper than Stable Diffusion's (about 20 vs. about 6,250 A100 GPU days).

Existing diffusion codecs typically build on text-to-image diffusion foundation models like Stable Diffusion. However, text conditioning is suboptimal from a compression perspective, hindering the potential of downstream diffusion codecs, particularly at ultra-low bitrates. To address this, we introduce CoD, the first Compression-oriented Diffusion foundation model, trained from scratch to enable end-to-end optimization of both compression and generation. CoD is not a fixed codec but a general foundation model designed for various diffusion-based codecs. It offers several advantages: (1) high compression efficiency: replacing Stable Diffusion with CoD in downstream codecs like DiffC achieves SOTA results, especially at ultra-low bitrates (e.g., 0.0039 bpp); (2) low-cost and reproducible training: 300× faster training than Stable Diffusion (~20 vs. ~6,250 A100 GPU days) on entirely open image-only datasets; (3) new insights: e.g., we find that pixel-space diffusion can achieve VTM-level PSNR with high perceptual quality and can outperform GAN-based codecs using fewer parameters. We hope CoD lays the foundation for future diffusion codec research. Code will be released.
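To put the two headline numbers in concrete terms, here is a small arithmetic sketch. The 512×512 resolution is an illustrative assumption and is not taken from the paper. The helper names `bits_for_image` and `speedup` are hypothetical. Only the 0.0039 bpp rate and the GPU-day figures come from the abstract.

```python
def bits_for_image(width: int, height: int, bpp: float) -> float:
    """Total bit budget for an image coded at `bpp` bits per pixel."""
    return width * height * bpp


def speedup(baseline_gpu_days: float, new_gpu_days: float) -> float:
    """Ratio of baseline training cost to the new training cost."""
    return baseline_gpu_days / new_gpu_days


# Assumed example resolution (not from the paper).
WIDTH, HEIGHT = 512, 512

# Figures quoted in the abstract.
BPP = 0.0039
SD_GPU_DAYS = 6250
COD_GPU_DAYS = 20

budget_bits = bits_for_image(WIDTH, HEIGHT, BPP)
budget_bytes = budget_bits / 8
ratio = speedup(SD_GPU_DAYS, COD_GPU_DAYS)

print(f"bit budget: {budget_bits:.1f} bits ({budget_bytes:.1f} bytes)")
print(f"training cost ratio: {ratio:.1f}x")
```

Under these assumptions the whole image must fit in roughly 1,022 bits (about 128 bytes), and the GPU-day ratio works out to 312.5, which the abstract rounds to 300×.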

