IV · CV · MM · Dec 17, 2025

Generative Preprocessing for Image Compression with Pre-trained Diffusion Models

arXiv:2512.15270v1 · h-index: 11
Originality: Highly original
AI Analysis

This work targets image compression for applications that require high visual fidelity, and it represents a novel approach rather than an incremental improvement.

The paper shifts image-compression preprocessing from Rate-Distortion to Rate-Perception optimization: a pre-trained diffusion model preprocesses each image to enhance texture and reduce artifacts, yielding up to a 30.13% BD-rate reduction in DISTS on the Kodak dataset.
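
For readers unfamiliar with the metric, the following is a minimal sketch of how a BD-rate figure of this kind is commonly computed. It uses the standard Bjøntegaard approach (cubic fit of log-rate against quality, averaged over the shared quality range). The function name `bd_rate` and the sample numbers are hypothetical and are not taken from the paper, and the paper's own evaluation script may differ in detail.

```python
import numpy as np

def bd_rate(rate_anchor, quality_anchor, rate_test, quality_test):
    """Bjontegaard delta rate in percent (negative means the test codec saves bits).

    Quality can be any monotone metric (PSNR, DISTS, ...); only the
    overlapping quality interval of the two curves is compared.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    qa = np.asarray(quality_anchor, dtype=float)
    qt = np.asarray(quality_test, dtype=float)

    # Fit log-rate as a cubic polynomial of quality for each curve.
    poly_a = np.polyfit(qa, log_ra, 3)
    poly_t = np.polyfit(qt, log_rt, 3)

    # Restrict the comparison to the quality range both curves cover.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    if hi <= lo:
        raise ValueError("rate-quality curves do not overlap in quality")

    # Average log-rate of each curve over the shared interval.
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Convert the mean log-rate difference to a percentage rate change.
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0

# Made-up illustrative points (bits per pixel, DISTS score), NOT paper results.
anchor_bpp = [0.2, 0.4, 0.8, 1.6]
anchor_dists = [0.25, 0.18, 0.12, 0.08]
test_bpp = [0.7 * r for r in anchor_bpp]  # same quality at 30% fewer bits
print(round(bd_rate(anchor_bpp, anchor_dists, test_bpp, anchor_dists), 2))
```

Because the comparison integrates over the shared quality interval, the same routine works whether the metric is higher-is-better (PSNR) or lower-is-better (DISTS).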

Preprocessing is a well-established technique for optimizing compression, yet existing methods are predominantly Rate-Distortion (R-D) optimized and constrained by pixel-level fidelity. This work pioneers a shift towards Rate-Perception (R-P) optimization by, for the first time, adapting a large-scale pre-trained diffusion model for compression preprocessing. We propose a two-stage framework: first, we distill the multi-step Stable Diffusion 2.1 into a compact, one-step image-to-image model using Consistent Score Identity Distillation (CiD). Second, we perform a parameter-efficient fine-tuning of the distilled model's attention modules, guided by a Rate-Perception loss and a differentiable codec surrogate. Our method seamlessly integrates with standard codecs without any modification and leverages the model's powerful generative priors to enhance texture and mitigate artifacts. Experiments show substantial R-P gains, achieving up to a 30.13% BD-rate reduction in DISTS on the Kodak dataset and delivering superior subjective visual quality.
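
The abstract does not give the form of the Rate-Perception loss. As a rough illustration only, a generic objective of this kind can be written as below. Every symbol is an assumption introduced here: $g_\theta$ is the one-step preprocessing model, $R_\phi$ and $D_\phi$ are the bitrate estimate and reconstruction of the differentiable codec surrogate, $d_P$ is a perceptual distance such as DISTS, and $\lambda$ is a trade-off weight. Measuring the perceptual term against the original image $x$ is also an assumption.

```latex
\tilde{x} = g_\theta(x), \qquad
\hat{x} = D_\phi(\tilde{x}), \qquad
\mathcal{L}_{\mathrm{RP}}(\theta) = R_\phi(\tilde{x}) + \lambda \, d_P(x, \hat{x})
```

In such a setup, gradients would flow through the surrogate to the fine-tuned attention parameters in $\theta$, while the real codec remains unmodified and is used only at inference time.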

