GAN-based Image Compression with Improved RDO Process
This work targets image compression for applications that require high perceptual fidelity at low bit rates, and it is an incremental improvement over prior GAN-based methods.
The paper tackles perceptual degeneration and inaccurate entropy modeling in GAN-based image compression. It introduces an improved rate-distortion optimization process that uses the DISTS and MS-SSIM metrics together with a discretized Gaussian-Laplacian-logistic mixture model. The resulting method is reported to outperform existing GAN-based methods and the state-of-the-art VVC codec in human perceptual quality, as measured by Mean Opinion Score.
GAN-based image compression schemes have recently shown remarkable progress owing to their high perceptual quality at low bit rates. However, two main issues remain: 1) perceptual degeneration of the reconstructed image in color, texture, and structure, and 2) an inaccurate entropy model. In this paper, we present a novel GAN-based image compression approach with an improved rate-distortion optimization (RDO) process. To this end, we use the DISTS and MS-SSIM metrics to measure perceptual degeneration in color, texture, and structure. In addition, we adopt the discretized Gaussian-Laplacian-logistic mixture model (GLLMM) for entropy modeling, which improves the accuracy of the estimated probability distributions of the latent representation. For evaluation, instead of assessing the perceptual quality of the reconstructed images with IQA metrics, we directly conduct a Mean Opinion Score (MOS) experiment across different codecs, which reflects the perceptual judgments of human viewers. Experimental results demonstrate that the proposed method outperforms existing GAN-based methods and the state-of-the-art hybrid codec (i.e., VVC).
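To make the two ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each quantized latent value is an integer, that its probability is the mixture mass falling in the unit-width bin around that integer, and that the objective is a plain weighted sum of rate, DISTS, (1 - MS-SSIM), and an adversarial term. The flat component list, the function names, and the weights `w_dists`, `w_msssim`, and `w_adv` are illustrative assumptions; the abstract does not specify any of them.

```python
import math


def gaussian_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def laplacian_cdf(x, mu, b):
    """CDF of a Laplacian with location mu and scale b."""
    z = (x - mu) / b
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


def logistic_cdf(x, mu, s):
    """CDF of a logistic distribution, written as a numerically stable sigmoid."""
    z = (x - mu) / s
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


CDFS = {"gaussian": gaussian_cdf, "laplacian": laplacian_cdf, "logistic": logistic_cdf}


def discretized_mixture_pmf(y, components):
    """Probability of integer latent y under a discretized mixture.

    components: list of (weight, family, mu, scale) tuples whose weights sum to 1.
    Each component contributes the mass it places on the bin [y - 0.5, y + 0.5].
    """
    p = 0.0
    for weight, family, mu, scale in components:
        cdf = CDFS[family]
        p += weight * (cdf(y + 0.5, mu, scale) - cdf(y - 0.5, mu, scale))
    return p


def estimated_bits(latents, components, eps=1e-9):
    """Estimated code length in bits: sum of -log2 p(y), with p floored at eps."""
    return sum(-math.log2(max(discretized_mixture_pmf(y, components), eps))
               for y in latents)


def rd_objective(rate, dists, ms_ssim, adv_loss,
                 w_dists=1.0, w_msssim=1.0, w_adv=1.0):
    """Hypothetical weighted RDO objective; the weights are placeholders.

    dists is a distance (lower is better); ms_ssim is a similarity in [0, 1],
    so it enters the objective as (1 - ms_ssim).
    """
    return rate + w_dists * dists + w_msssim * (1.0 - ms_ssim) + w_adv * adv_loss


# Illustrative parameters only; in a learned codec they would be predicted
# per latent element by the entropy model network.
example_components = [
    (0.5, "gaussian", 0.0, 1.0),
    (0.3, "laplacian", 0.2, 2.0),
    (0.2, "logistic", -0.1, 1.5),
]
example_bits = estimated_bits([0, 1, -1, 3], example_components)
```

In this sketch a sharper, better-centered mixture assigns more probability to the latents that actually occur, which lowers `estimated_bits`. That is the sense in which a more accurate entropy model reduces the rate term of the objective.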