Joint Global and Local Hierarchical Priors for Learned Image Compression
The work addresses the long-range dependency bottleneck of CNN-based entropy models in learned image compression, though it is an incremental improvement over existing learned methods.
The paper tackles the limitation of CNNs in modeling long-range dependencies for learned image compression. It proposes a novel entropy model, Information Transformer (Informer), that uses an attention mechanism to exploit both global and local information, improving rate-distortion performance over state-of-the-art methods on the Kodak and Tecnick datasets.
Recently, learned image compression methods have outperformed traditional hand-crafted ones, including BPG. One of the keys to this success is learned entropy models that estimate the probability distribution of the quantized latent representation. Like other vision tasks, most recent learned entropy models are based on convolutional neural networks (CNNs). However, CNNs have a limitation in modeling long-range dependencies due to their nature of local connectivity, which can be a significant bottleneck in image compression where reducing spatial redundancy is a key point. To overcome this issue, we propose a novel entropy model called Information Transformer (Informer) that exploits both global and local information in a content-dependent manner using an attention mechanism. Our experiments show that Informer improves rate-distortion performance over the state-of-the-art methods on the Kodak and Tecnick datasets without the quadratic computational complexity problem. Our source code is available at https://github.com/naver-ai/informer.
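To make the role of an entropy model concrete, the following minimal sketch estimates the bit cost of a few quantized latent values from predicted distribution parameters. It is not the Informer model: the discretized Gaussian form, the function names (`normal_cdf`, `symbol_probability`, `estimated_bits`), and all numeric values are assumptions chosen for illustration. The abstract states only that the entropy model estimates the probability distribution of the quantized latent representation.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def symbol_probability(y: int, mu: float, sigma: float) -> float:
    """Probability mass of integer symbol y under a Gaussian
    discretized into unit-width bins (an assumed model form)."""
    upper = normal_cdf((y + 0.5 - mu) / sigma)
    lower = normal_cdf((y - 0.5 - mu) / sigma)
    # A small floor keeps the logarithm finite for very unlikely symbols.
    return max(upper - lower, 1e-9)


def estimated_bits(latents, mus, sigmas) -> float:
    """Estimated code length in bits: sum of -log2 p(y) over all symbols."""
    return sum(
        -math.log2(symbol_probability(y, mu, sigma))
        for y, mu, sigma in zip(latents, mus, sigmas)
    )


# Hypothetical quantized latents and two hypothetical sets of predictions.
latents = [0, 1, -2, 3]

# Predictions that track the latents closely, with small uncertainty.
sharp = estimated_bits(latents, mus=[0.0, 1.0, -2.0, 3.0], sigmas=[0.5] * 4)

# Uninformed predictions: zero mean and wide uncertainty everywhere.
vague = estimated_bits(latents, mus=[0.0] * 4, sigmas=[2.0] * 4)

print(f"sharp predictions: {sharp:.2f} bits")
print(f"vague predictions: {vague:.2f} bits")
```

Under these assumptions, the better the predicted parameters match the actual latents, the fewer bits are needed. This is why a richer context for the prediction, such as the global and local information the abstract describes, can lower the rate.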