Theoretical Bound-Guided Hierarchical VAE for Neural Image Codecs
This work aims to improve neural image compression for applications that require efficient storage and transmission. It is incremental in that it builds on existing VAE and rate-distortion theory.
The paper tackles the performance gap between theoretical rate-distortion bounds and existing neural image codecs by proposing a theoretical bound-guided hierarchical VAE. The result is a variable-rate codec that outperforms prior methods when rate-distortion performance and computational complexity are considered together.
Recent studies reveal a significant theoretical link between variational autoencoders (VAEs) and rate-distortion theory, notably in utilizing VAEs to estimate the theoretical upper bound of the information rate-distortion function of images. Such estimated theoretical bounds substantially exceed the performance of existing neural image codecs (NICs). To narrow this gap, we propose a theoretical bound-guided hierarchical VAE (BG-VAE) for NIC. The proposed BG-VAE leverages the theoretical bound to guide the NIC model towards enhanced performance. We implement the BG-VAE using hierarchical VAEs and demonstrate its effectiveness through extensive experiments. Along with advanced neural network blocks, we provide a versatile, variable-rate NIC that outperforms existing methods when considering both rate-distortion performance and computational complexity. The code is available at BG-VAE.
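
To make the link between VAEs and rate-distortion theory concrete, the following minimal sketch evaluates a rate-distortion Lagrangian of the kind a hierarchical VAE is commonly trained on: the rate is the sum of per-layer KL terms (in bits) and the distortion is mean squared error. This is an illustration under stated assumptions, not the BG-VAE implementation. The function names, the diagonal-Gaussian posteriors, the standard-normal prior at every layer, and the multiplier `lam` are all hypothetical choices; the abstract does not specify the priors, the distortion measure, or how the bound guides training.

```python
import math

def gaussian_kl_bits(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over dimensions, in bits.

    Assumption: a standard-normal prior; a real hierarchical VAE would
    typically use a learned, conditional prior at each layer.
    """
    nats = sum(0.5 * (m * m + s * s - 1.0) - math.log(s)
               for m, s in zip(mu, sigma))
    return nats / math.log(2.0)

def mse(x, x_hat):
    """Mean squared error between a signal and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def rd_loss(layers, x, x_hat, lam):
    """Return (rate + lam * distortion, rate, distortion).

    `layers` is a list of (mu, sigma) pairs, one per latent layer
    (hypothetical structure chosen for this sketch).
    """
    rate = sum(gaussian_kl_bits(mu, sigma) for mu, sigma in layers)
    distortion = mse(x, x_hat)
    return rate + lam * distortion, rate, distortion

# Toy example: two latent layers, a two-sample "image".
layers = [([0.0], [1.0]), ([1.0], [1.0])]
loss, rate, distortion = rd_loss(layers, x=[1.0, 2.0], x_hat=[1.0, 0.0], lam=2.0)
print(f"rate={rate:.4f} bits, distortion={distortion:.4f}, loss={loss:.4f}")
```

Sweeping `lam` trades rate against distortion. Each value yields one (rate, distortion) operating point, which is how such an objective traces out a curve that can be compared against an estimated bound.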