Multi-Realism Image Compression with a Conditional Generator
The work addresses the concerns of users of generative compression methods who need reliable reconstructions without misleading details, and it offers incremental improvements in control and performance.
The paper tackles the lack of explicit control over detail synthesis in generative image compression, which can produce misleading reconstructions. It introduces a decoder that lets the receiver navigate the distortion-realism trade-off from a single compressed representation. The method sets a new state of the art in distortion-realism, achieving lower distortion at high realism and higher realism at low distortion than previous methods.
By optimizing the rate-distortion-realism trade-off, generative compression approaches produce detailed, realistic images, even at low bit rates, instead of the blurry reconstructions produced by rate-distortion optimized models. However, previous methods do not explicitly control how much detail is synthesized, which leads to a common criticism of these methods: users might worry that the generated reconstruction is misleading and far from the input image. In this work, we alleviate these concerns by training a decoder that can bridge the two regimes and navigate the distortion-realism trade-off. From a single compressed representation, the receiver can decide to produce a low mean squared error reconstruction that is close to the input, a realistic reconstruction with high perceptual quality, or anything in between. With our method, we set a new state of the art in distortion-realism, pushing the frontier of achievable distortion-realism pairs, i.e., our method achieves better distortions at high realism and better realism at low distortion than previous methods.
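To make the receiver-side choice concrete, the following is a minimal toy sketch in Python, not the paper's model. It assumes a hypothetical scalar knob `beta` in [0, 1] and a hypothetical `toy_decode` function that adds `beta` times a synthesized texture to a smooth base reconstruction. The random `detail` array stands in for generated detail, and the variance gap is only a crude proxy for realism. All names and the additive form are assumptions made for illustration.

```python
import numpy as np


def toy_decode(base, detail, beta):
    """Hypothetical stand-in for a conditional decoder.

    beta = 0 returns the smooth, low-distortion reconstruction;
    beta = 1 adds the full synthesized detail.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return base + beta * detail


rng = np.random.default_rng(0)
side = 64

# Smooth content that survives compression (assumed to be all the latent keeps).
base = np.tile(np.linspace(0.0, 1.0, side), (side, 1))
# Fine texture of the input that is lost at low bit rates.
texture = rng.normal(size=(side, side))
x = base + texture

# Synthesized texture: same statistics as the lost one, different realization.
detail = rng.normal(size=(side, side))

results = []
for beta in (0.0, 0.5, 1.0):
    x_hat = toy_decode(base, detail, beta)
    distortion = float(np.mean((x - x_hat) ** 2))          # mean squared error
    realism_gap = float(abs(np.var(x_hat) - np.var(x)))    # crude realism proxy
    results.append((beta, distortion, realism_gap))
    print(f"beta={beta:.1f}  mse={distortion:.3f}  variance_gap={realism_gap:.3f}")
```

In this toy setting, raising `beta` increases the mean squared error while shrinking the variance gap, which mirrors the trade-off described above: the same stored representation yields either a reconstruction close to the input or one with more realistic statistics, depending on a value chosen at decode time.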