CV | IV | Feb 20, 2025

Stereo Image Coding for Machines with Joint Visual Feature Compression

arXiv:2502.14190v1 | 1 citation | h-index: 38
Originality: Incremental advance
AI Analysis

This work addresses efficient stereo image coding for machines. The advance is incremental: it extends 2D image coding for machines (ICM) to the stereo domain.

The paper tackles stereo image compression for machine vision by proposing a network that extracts and compresses stereo visual features, achieving superior compression efficiency and 3D task performance compared to existing methods.

2D image coding for machines (ICM) has achieved great success in coding efficiency, while less effort has been devoted to stereo images. To promote the efficiency of stereo image compression (SIC) and intelligent analysis, stereo image coding for machines (SICM) is formulated and explored in this paper. More specifically, a machine vision-oriented stereo feature compression network (MVSFC-Net) is proposed for SICM, in which stereo visual features are extracted, compressed, and transmitted for 3D visual tasks. To compress stereo visual features efficiently in MVSFC-Net, a stereo multi-scale feature compression (SMFC) module is designed to gradually transform sparse stereo multi-scale features into compact joint visual representations by removing spatial, inter-view, and cross-scale redundancies simultaneously. Experimental results show that the proposed MVSFC-Net achieves superior compression efficiency and 3D visual task performance compared with the existing ICM anchors recommended by MPEG and the state-of-the-art SIC method.
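The abstract names inter-view redundancy as one of the three redundancies the SMFC module removes. The toy sketch below is not the paper's network; it only illustrates, on a made-up 1-D signal, why coding one view jointly with the other can cost fewer bits than coding it alone. The function names, the fixed disparity, the noise model, and the signal width are all hypothetical, and classical disparity-compensated prediction stands in here for the learned joint representation.

```python
import math
import random
from collections import Counter


def entropy_bits(symbols):
    """Empirical zeroth-order entropy, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def toy_stereo_rows(width=512, disparity=7, seed=0):
    """Hypothetical 1-D stereo pair: right[i] is left[i + disparity] plus
    noise in {-1, 0, 1}. Width, disparity, and noise are made-up values."""
    rng = random.Random(seed)
    # Values stay in 1..254 so adding the noise never leaves 0..255.
    left = [rng.randrange(1, 255) for _ in range(width + disparity)]
    right = [left[i + disparity] + rng.choice((-1, 0, 1)) for i in range(width)]
    return left, right


def inter_view_residual(left, right, disparity):
    """Predict the right row from the disparity-shifted left row and keep
    only the prediction error."""
    return [r - left[i + disparity] for i, r in enumerate(right)]


DISPARITY = 7  # assumed known here; a real system would have to estimate it
left, right = toy_stereo_rows(disparity=DISPARITY)
residual = inter_view_residual(left, right, DISPARITY)

independent_bits = entropy_bits(right)   # right view coded on its own
joint_bits = entropy_bits(residual)      # right view coded given the left view

print(f"right view alone:      {independent_bits:.2f} bits/sample")
print(f"right view given left: {joint_bits:.2f} bits/sample")
```

Because the residual takes only three values, its entropy is bounded by log2(3) bits per sample, while the right row coded independently looks close to random. Spatial and cross-scale redundancy follow the same logic, with neighbouring positions or adjacent pyramid levels playing the role of the predictor.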

