CV · Apr 15

CoD-Lite: Real-Time Diffusion-Based Generative Image Compression

arXiv:2604.12525 · 82.1 · h-index: 11 · Has Code
Predicted impact: top 25% in CV · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners needing real-time generative image compression, this work provides a lightweight solution that matches prior generative codec quality while enabling real-time throughput.

The paper introduces CoD-Lite, a one-step, lightweight convolutional diffusion codec that encodes at 60 FPS and decodes at 42 FPS at 1080p. It reduces bitrate by 85% at an FID comparable to MS-ILLM, narrowing the gap between generative compression and practical real-time deployment.

Recent advanced diffusion methods typically derive strong generative priors by scaling diffusion transformers. However, scaling fails to generalize when adapted for real-time compression scenarios that demand lightweight models. In this paper, we explore the design of real-time and lightweight diffusion codecs by addressing two pivotal questions. First, does diffusion pre-training benefit lightweight diffusion codecs? Through systematic analysis, we find that generation-oriented pre-training is less effective at small model scales whereas compression-oriented pre-training yields consistently better performance. Second, are transformers essential? We find that while global attention is crucial for standard generation, lightweight convolutions suffice for compression-oriented diffusion when paired with distillation. Guided by these findings, we establish a one-step lightweight convolution diffusion codec that achieves real-time 60 FPS encoding and 42 FPS decoding at 1080p. Further enhanced by distillation and adversarial learning, the proposed codec reduces bitrate by 85% at a comparable FID to MS-ILLM, bridging the gap between generative compression and practical real-time deployment. Codes are released at https://github.com/microsoft/GenCodec/tree/main/CoD_Lite
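To put the headline numbers in perspective, the short sketch below converts the throughput and bitrate figures quoted in the abstract into per-frame time budgets and a bitstream size ratio. It is illustrative arithmetic only, not code from the CoD-Lite repository. The helper names are hypothetical, 1080p is assumed to mean 1920×1080, and the per-frame budget assumes frames are processed one at a time with no batching or pipelining.

```python
# Illustrative arithmetic only. The 60/42 FPS and 85% figures come from the
# abstract; the function names and the 1920x1080 resolution are assumptions.


def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given throughput."""
    return 1000.0 / fps


def megapixels_per_second(fps: float, width: int = 1920, height: int = 1080) -> float:
    """Pixel throughput implied by a frame rate at a given resolution."""
    return fps * width * height / 1e6


def size_ratio(bitrate_reduction: float) -> float:
    """How many times smaller the bitstream is for a fractional bitrate reduction."""
    return 1.0 / (1.0 - bitrate_reduction)


if __name__ == "__main__":
    for stage, fps in [("encode", 60.0), ("decode", 42.0)]:
        print(
            f"{stage}: {frame_budget_ms(fps):.1f} ms/frame, "
            f"{megapixels_per_second(fps):.1f} MP/s"
        )
    print(f"85% bitrate reduction: {size_ratio(0.85):.2f}x smaller bitstream")
```

Under these assumptions, 60 FPS encoding leaves roughly 16.7 ms per frame and 42 FPS decoding roughly 23.8 ms, while an 85% bitrate reduction means the bitstream is about 6.7 times smaller than the MS-ILLM reference at comparable FID.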

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
