CV · IV · May 22, 2025

One-Step Diffusion-Based Image Compression with Semantic Distillation

arXiv:2505.16687v1 · 15 citations · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses slow decoding in generative image compression. The advance is incremental, since it builds on existing diffusion-based codecs.

The paper tackles the latency of diffusion-based image compression by proposing a one-step diffusion codec. Compared to prior multi-step diffusion-based codecs, it reports over 40% bitrate reduction and 20x faster decoding while maintaining state-of-the-art perceptual quality.

While recent diffusion-based generative image codecs have shown impressive performance, their iterative sampling process introduces undesirable latency. In this work, we revisit the design of a diffusion-based codec and argue that multi-step sampling is not necessary for generative compression. Based on this insight, we propose OneDC, a One-step Diffusion-based generative image Codec that integrates a latent compression module with a one-step diffusion generator. Recognizing the critical role of semantic guidance in one-step diffusion, we propose using the hyperprior as a semantic signal, overcoming the limitations of text prompts in representing complex visual content. To further enhance the semantic capability of the hyperprior, we introduce a semantic distillation mechanism that transfers knowledge from a pretrained generative tokenizer to the hyperprior codec. Additionally, we adopt a hybrid pixel- and latent-domain optimization to jointly enhance both reconstruction fidelity and perceptual realism. Extensive experiments demonstrate that OneDC achieves state-of-the-art perceptual quality even with one-step generation, offering over 40% bitrate reduction and 20x faster decoding compared to prior multi-step diffusion-based codecs. Code will be released later.
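The decoding flow described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: every name below (`Bitstream`, `CountingGenerator`, `decode_one_step`, `decode_multi_step`) is hypothetical, the "generator" is a trivial arithmetic stand-in for a diffusion network, and the step count of 50 is an arbitrary illustrative value. It only shows the structural difference between one generator call conditioned on the hyperprior and an iterative sampling loop.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class Bitstream:
    """Hypothetical container for what the decoder receives."""
    latent: List[int]      # quantized latent symbols
    hyperprior: List[int]  # quantized hyperprior, reused here as the semantic signal


def encode(latent: Vector, hyper: Vector) -> Bitstream:
    # Toy "compression": plain rounding stands in for quantization + entropy coding.
    return Bitstream([round(v) for v in latent], [round(v) for v in hyper])


class CountingGenerator:
    """Toy stand-in for a diffusion generator; counts forward passes."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: Vector, semantic: Vector) -> Vector:
        self.calls += 1
        # Assumption for illustration only: pull each value halfway
        # toward the mean of the semantic signal.
        bias = sum(semantic) / len(semantic)
        return [0.5 * v + 0.5 * bias for v in x]


def decode_one_step(bits: Bitstream, generator: CountingGenerator) -> Vector:
    # One forward pass, guided by the decoded hyperprior.
    semantic = [float(s) for s in bits.hyperprior]
    x = [float(s) for s in bits.latent]
    return generator(x, semantic)


def decode_multi_step(bits: Bitstream, generator: CountingGenerator, steps: int) -> Vector:
    # Iterative sampling: the generator runs once per step.
    semantic = [float(s) for s in bits.hyperprior]
    x = [float(s) for s in bits.latent]
    for _ in range(steps):
        x = generator(x, semantic)
    return x


bits = encode([1.2, -0.7, 3.4], [2.0, 4.0])
one_step_gen = CountingGenerator()
multi_step_gen = CountingGenerator()
one_step_out = decode_one_step(bits, one_step_gen)
multi_step_out = decode_multi_step(bits, multi_step_gen, steps=50)
```

In this sketch the one-step path invokes the generator once and the multi-step path invokes it 50 times. The call count is only a rough proxy for latency; the 20x figure in the abstract is a measured result against prior codecs, not something this toy reproduces.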

