CV LG ML | Jan 31, 2024

Robustly overfitting latents for flexible neural image compression

Yura Perugachi-Diaz, Arwin Gansekoele, Sandjai Bhulai

arXiv:2401.17789v3 | 6.54 citations | h-index: 25 | NIPS

Originality: Incremental advance

AI Analysis

This work makes incremental improvements to neural image compression, targeting applications that require efficient image transmission.

The paper tackles sub-optimal performance in neural image compression models by introducing SGA+, a method that refines latents to improve the rate-distortion trade-off, showing gains on datasets like Tecnick and CLIC.

Neural image compression has made a great deal of progress. State-of-the-art models are based on variational autoencoders and outperform classical models. Neural compression models learn to encode an image into a quantized latent representation that can be efficiently sent to the decoder, which decodes the quantized latent into a reconstructed image. While these models have proven successful in practice, they lead to sub-optimal results due to imperfect optimization and limitations in encoder and decoder capacity. Recent work shows how to use stochastic Gumbel annealing (SGA) to refine the latents of pre-trained neural image compression models. We extend this idea by introducing SGA+, which contains three different methods that build upon SGA. We show how our method improves the overall compression performance in terms of the rate-distortion (R-D) trade-off, compared to its predecessors. Additionally, we show how refinement of the latents with our best-performing method improves the compression performance on both the Tecnick and CLIC datasets. Our method is deployed for a pre-trained hyperprior and for a more flexible model. Further, we give a detailed analysis of our proposed methods and show that they are less sensitive to hyperparameter choices. Finally, we show how each method can be extended to three- instead of two-class rounding.
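To make the idea of two-class stochastic rounding more concrete, here is a minimal NumPy sketch of the rounding step as it is commonly described for the original SGA, not the SGA+ variants from this paper. It assumes inverse-hyperbolic-tangent logits over the two candidates floor(y) and floor(y) + 1, and it reuses the same temperature `tau` for the Gumbel-softmax relaxation. The function names, the clipping constant `eps`, and the temperature reuse are all illustrative assumptions and are not taken from the abstract.

```python
import numpy as np


def _log_softmax(x):
    # Numerically stable log-softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))


def two_class_rounding_probs(y, tau, eps=1e-6):
    """Probability of rounding each latent down (index 0) or up (index 1).

    Assumption: logits are -atanh(distance to candidate) / tau, so the
    nearer integer is more likely, and more so as tau shrinks.
    """
    y = np.asarray(y, dtype=np.float64)
    frac = np.clip(y - np.floor(y), eps, 1.0 - eps)  # keep arctanh finite
    logits = np.stack([-np.arctanh(frac), -np.arctanh(1.0 - frac)], axis=-1) / tau
    return np.exp(_log_softmax(logits))


def sample_soft_rounding(y, tau, rng, eps=1e-6):
    """Gumbel-softmax relaxed rounding: a convex mix of floor(y) and floor(y) + 1."""
    y = np.asarray(y, dtype=np.float64)
    frac = np.clip(y - np.floor(y), eps, 1.0 - eps)
    logits = np.stack([-np.arctanh(frac), -np.arctanh(1.0 - frac)], axis=-1) / tau
    log_probs = _log_softmax(logits)
    gumbel = rng.gumbel(size=log_probs.shape)
    # Assumption: the relaxation uses the same temperature tau.
    weights = np.exp(_log_softmax((log_probs + gumbel) / tau))
    low = np.floor(y)
    return weights[..., 0] * low + weights[..., 1] * (low + 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latents = np.array([1.3, -0.7, 0.5])
    for tau in (1.0, 0.3, 0.05):  # annealing tau pushes toward hard rounding
        print(tau, two_class_rounding_probs(latents, tau)[:, 0].round(3))
    print(sample_soft_rounding(latents, 0.5, rng))
```

In a full refinement loop, the relaxed latents would presumably be passed through the pre-trained decoder and entropy model, and the continuous latents updated by gradient descent on the R-D loss while `tau` is annealed toward zero. That loop is omitted here because it depends on the specific compression model.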

