IV · AI · CV · Jul 15, 2024

Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model

arXiv:2407.10632v2 · 11 citations · h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses compression inefficiencies in stereo vision applications, representing an incremental advancement over previous unidirectional approaches.

The paper tackles imbalanced compression between the two views of a stereo pair by introducing a symmetric bidirectional architecture, BiSIC, which the authors report outperforms both conventional codecs and learning-based methods in PSNR and MS-SSIM.

With the rapid advancement of stereo vision technologies, stereo image compression has emerged as a crucial field that continues to draw significant attention. Previous approaches have primarily employed a unidirectional paradigm, where the compression of one view is dependent on the other, resulting in imbalanced compression. To address this issue, we introduce a symmetric bidirectional stereo image compression architecture, named BiSIC. Specifically, we propose a 3D convolution based codec backbone to capture local features and incorporate bidirectional attention blocks to exploit global features. Moreover, we design a novel cross-dimensional entropy model that integrates various conditioning factors, including the spatial context, channel context, and stereo dependency, to effectively estimate the distribution of latent representations for entropy coding. Extensive experiments demonstrate that our proposed BiSIC outperforms conventional image/video compression standards, as well as state-of-the-art learning-based methods, in terms of both PSNR and MS-SSIM.
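To make the "imbalanced compression" issue and the PSNR metric concrete, here is a minimal sketch that scores each view of a decoded stereo pair separately and reports the gap between them. It uses the standard PSNR definition. The function names, the flat-list pixel representation, and the choice to report a simple mean and absolute gap are illustrative assumptions, not the paper's evaluation code.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(original, decoded, peak=255.0):
    """Standard PSNR in dB; returns infinity for a perfect reconstruction."""
    err = mse(original, decoded)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


def stereo_psnr(left, left_hat, right, right_hat, peak=255.0):
    """Per-view PSNR plus mean and gap (hypothetical helper, for illustration).

    A large gap means one view was reconstructed much better than the other,
    which is the imbalance a unidirectional codec can produce.
    """
    p_left = psnr(left, left_hat, peak)
    p_right = psnr(right, right_hat, peak)
    return {
        "left": p_left,
        "right": p_right,
        "mean": (p_left + p_right) / 2.0,
        "gap": abs(p_left - p_right),
    }


if __name__ == "__main__":
    # Toy 4-pixel "images": the right view is decoded with twice the error.
    left = [10, 20, 30, 40]
    right = [50, 60, 70, 80]
    left_hat = [p + 1 for p in left]
    right_hat = [p + 2 for p in right]
    print(stereo_psnr(left, left_hat, right, right_hat))
```

In this toy case the right view has four times the mean squared error of the left, so its PSNR is lower by 10·log10(4), roughly 6 dB. A symmetric codec would aim to keep that gap small at a given bitrate.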

Code Implementations: 1 repo
