IV CV LG ML May 26, 2023

High-Fidelity Image Compression with Score-based Generative Models

Emiel Hoogeboom, Eirikur Agustsson, Fabian Mentzer, Luca Versari, George Toderici, Lucas Theis

arXiv:2305.18231v3 30.068 citations

Originality: Incremental advance

AI Analysis

This work targets image compression for applications that require high perceptual quality. The advance is incremental: it adapts existing diffusion methods to a new domain.

The paper tackles the challenge of applying diffusion generative models to image compression. It shows that a two-stage approach, an autoencoder targeting MSE followed by a score-based decoder, improves perceptual quality at a given bit-rate and outperforms the state-of-the-art methods PO-ELIC and HiFiC as measured by FID score.

Despite the tremendous success of diffusion generative models in text-to-image generation, replicating this success in the domain of image compression has proven difficult. In this paper, we demonstrate that diffusion can significantly improve perceptual quality at a given bit-rate, outperforming state-of-the-art approaches PO-ELIC and HiFiC as measured by FID score. This is achieved using a simple but theoretically motivated two-stage approach combining an autoencoder targeting MSE followed by a further score-based decoder. However, as we will show, implementation details matter and the optimal design decisions can differ greatly from typical text-to-image models.
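The two-stage structure described in the abstract can be illustrated with a toy sketch. Nothing below is the authors' implementation. Stage 1 is replaced by average pooling plus uniform quantization as a stand-in for a learned MSE autoencoder, and the trained score network is replaced by a hypothetical analytic Gaussian score (`toy_score`) centred on the stage-1 reconstruction. The names `STEP`, `FACTOR`, `toy_score`, `score_based_decode`, and `compress_and_reconstruct` are assumptions made for illustration, and the sampler is plain annealed Langevin dynamics, which may differ from the sampler used in the paper.

```python
import numpy as np

STEP = 0.25   # quantization step size (assumed for this toy example)
FACTOR = 4    # spatial downsampling factor (assumed for this toy example)


def encode(image):
    """Stage 1 encoder stand-in: average-pool, then quantize to integer symbols.

    Height and width must be divisible by FACTOR.
    """
    h, w = image.shape
    pooled = image.reshape(h // FACTOR, FACTOR, w // FACTOR, FACTOR).mean(axis=(1, 3))
    return np.round(pooled / STEP).astype(np.int32)


def decode_mse(symbols):
    """Stage 1 decoder stand-in: dequantize and upsample by pixel repetition."""
    pooled = symbols.astype(np.float64) * STEP
    return np.repeat(np.repeat(pooled, FACTOR, axis=0), FACTOR, axis=1)


def toy_score(x, sigma, cond, data_std):
    """Hypothetical score function (NOT a trained network).

    Score of N(cond, data_std^2 I) after adding Gaussian noise of scale sigma.
    """
    return -(x - cond) / (data_std**2 + sigma**2)


def score_based_decode(cond, rng, sigmas=np.geomspace(1.0, 0.01, 10),
                       steps=20, data_std=0.05):
    """Stage 2 stand-in: annealed Langevin sampling conditioned on `cond`."""
    x = sigmas[0] * rng.standard_normal(cond.shape)  # start from pure noise
    for sigma in sigmas:                             # anneal noise level downwards
        var = data_std**2 + sigma**2
        eps = 0.1 * var                              # step size scaled to the noise level
        for _ in range(steps):
            noise = rng.standard_normal(cond.shape)
            x = x + eps * toy_score(x, sigma, cond, data_std) + np.sqrt(2 * eps) * noise
    return x


def compress_and_reconstruct(image, seed=0):
    """Run both stages: symbols to transmit, MSE reconstruction, sampled output."""
    rng = np.random.default_rng(seed)
    symbols = encode(image)                       # what would be entropy-coded and sent
    mse_recon = decode_mse(symbols)               # stage 1 output
    sample = score_based_decode(mse_recon, rng)   # stage 2 output
    return symbols, mse_recon, sample
```

In this sketch only `symbols` would be transmitted. The second stage runs entirely at the decoder and adds detail around the stage-1 reconstruction without spending additional bits, which is the general shape of the approach the abstract describes.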
