IV · CV · May 17, 2023

VVC+M: Plug and Play Scalable Image Coding for Humans and Machines

arXiv:2305.10453v1 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient image compression for both automated analysis and human viewing. It offers a plug-and-play method that improves machine-task performance while remaining competitive for human perception; the advance is incremental because it builds on existing codecs.

The paper tackles sub-optimal rate-distortion performance in scalable image coding for humans and machines. It proposes using VVC residual coding to turn an existing image-compression-for-machines scheme into a scalable codec, achieving superior machine-task performance while remaining competitive for human perception.

Compression for machines is an emerging field, where inputs are encoded while optimizing the performance of downstream automated analysis. In scalable coding for humans and machines, the compressed representation used for machines is further utilized to enable input reconstruction. This is often done by jointly optimizing the compression scheme for both the machine task and human perception, which results in sub-optimal rate-distortion (RD) performance on the machine side. We focus on the case of images, proposing to utilize the pre-existing residual coding capabilities of video codecs such as VVC to create a scalable codec from any image compression for machines (ICM) scheme. Using our approach, we improve an existing scalable codec to achieve superior RD performance on the machine task while remaining competitive for human perception. Moreover, our approach can be trained post hoc for any given ICM scheme, without coupling the quality of the machine analysis to that of human vision.
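The layered idea in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: a uniform quantizer stands in for VVC residual coding, `base_prediction` is a hypothetical preview image derived from the ICM representation, and all function names are invented for this example. The point it shows is that the enhancement layer only codes what the base prediction misses, so the machine-side bitstream is left untouched.

```python
from typing import List

Image = List[List[int]]  # 8-bit grayscale image, row-major (assumption for this sketch)


def encode_residual(image: Image, base_prediction: Image, step: int) -> Image:
    """Enhancement layer: quantized difference between the input and the base prediction.

    A uniform quantizer with step size `step` is a stand-in for a real residual coder.
    """
    return [
        [round((x - p) / step) for x, p in zip(row_x, row_p)]
        for row_x, row_p in zip(image, base_prediction)
    ]


def decode_image(base_prediction: Image, residual_q: Image, step: int) -> Image:
    """Human-viewing reconstruction: base prediction plus dequantized residual, clipped to 8 bits."""
    return [
        [min(255, max(0, p + q * step)) for p, q in zip(row_p, row_q)]
        for row_p, row_q in zip(base_prediction, residual_q)
    ]


def max_abs_error(a: Image, b: Image) -> int:
    """Largest per-pixel absolute difference between two images of equal size."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    image = [[10, 200], [37, 255]]
    base_prediction = [[12, 190], [40, 250]]  # hypothetical preview from the machine layer
    for step in (1, 4):
        residual = encode_residual(image, base_prediction, step)
        recon = decode_image(base_prediction, residual, step)
        print(step, max_abs_error(image, recon))
```

A larger `step` spends fewer bits on the enhancement layer at the cost of reconstruction error, while the base layer, and therefore the machine task, is unaffected by that choice.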

