IV · CV · LG · Jul 18, 2023

ECSIC: Epipolar Cross Attention for Stereo Image Compression

arXiv:2307.10284v2 · 14 citations · h-index: 31
Originality: Incremental advance
AI Analysis

The paper addresses the problem of efficient compression for stereo images. The advance is incremental: it builds on existing learned compression methods and adds specific architectural improvements.

The paper tackles stereo image compression by proposing ECSIC, a learned method that uses a novel stereo cross attention module and stereo context modules to exploit mutual information between left and right images, achieving state-of-the-art performance on Cityscapes and InStereo2k datasets.

In this paper, we present ECSIC, a novel learned method for stereo image compression. Our proposed method compresses the left and right images in a joint manner by exploiting the mutual information between the images of the stereo image pair using a novel stereo cross attention (SCA) module and two stereo context modules. The SCA module performs cross-attention restricted to the corresponding epipolar lines of the two images and processes them in parallel. The stereo context modules improve the entropy estimation of the second encoded image by using the first image as a context. We conduct an extensive ablation study demonstrating the effectiveness of the proposed modules and a comprehensive quantitative and qualitative comparison with existing methods. ECSIC achieves state-of-the-art performance in stereo image compression on the two popular stereo image datasets Cityscapes and InStereo2k while allowing for fast encoding and decoding.
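The row-restricted cross-attention described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' SCA module: it assumes a rectified stereo pair, so that corresponding epipolar lines are simply image rows with the same index, and it omits the learned query/key/value projections, multiple heads, and residual connections a real module would likely have. The function name `epipolar_cross_attention` and the tensor shapes are hypothetical, chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def epipolar_cross_attention(query_feat, context_feat):
    """Cross-attention restricted to matching rows (hypothetical helper).

    query_feat, context_feat: feature maps of shape (H, W, C).
    Assumes rectified images, so row h of one image only attends
    to row h of the other image. Returns an array of shape (H, W, C).
    """
    channels = query_feat.shape[-1]
    # scores[h, i, j]: similarity between pixel i in row h of the query
    # image and pixel j in the same row h of the context image.
    scores = np.einsum("hic,hjc->hij", query_feat, context_feat)
    scores = scores / np.sqrt(channels)
    weights = softmax(scores, axis=-1)
    # Weighted sum of context features along each row; all rows are
    # handled in one batched operation.
    return np.einsum("hij,hjc->hic", weights, context_feat)


rng = np.random.default_rng(0)
left = rng.standard_normal((4, 6, 8))   # toy left-image features
right = rng.standard_normal((4, 6, 8))  # toy right-image features

# Both directions can be computed independently of each other.
left_from_right = epipolar_cross_attention(left, right)
right_from_left = epipolar_cross_attention(right, left)
```

Restricting attention to one row reduces the attention matrix from (H·W) × (H·W) entries to H matrices of size W × W, which is one plausible reason such a design can keep encoding and decoding fast.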

Code Implementations: 1 repo