CVSP · May 16, 2024

Towards Task-Compatible Compressible Representations

arXiv:2405.10244v3 · 3 citations · h-index: 5 · 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)
Originality: Incremental advance
AI Analysis

This paper addresses a bottleneck in scalable coding for computer vision by making learned representations more compatible across tasks. The advance is incremental, since it builds on prior work in learnable compression.

The paper tackles a problem in multi-task learnable compression: a representation learned for one task contributes less to other tasks than its estimated information content would suggest. The authors interpret this gap using the predictive $\mathcal{V}$-information framework. The method, evaluated with representations trained for object detection and depth estimation, shows considerable improvements in rate-distortion performance for assisted tasks (image reconstruction and semantic segmentation), and the base tasks also see gains.

We identify an issue in multi-task learnable compression, in which a representation learned for one task does not positively contribute to the rate-distortion performance of a different task as much as expected, given the estimated amount of information available in it. We interpret this issue using the predictive $\mathcal{V}$-information framework. In learnable scalable coding, previous work increased the utilization of side-information for input reconstruction by also rewarding input reconstruction when learning this shared representation. We evaluate the impact of this idea in the context of input reconstruction more rigorously and extend it to other computer vision tasks. We perform experiments using representations trained for object detection on COCO 2017 and depth estimation on the Cityscapes dataset, and use them to assist in image reconstruction and semantic segmentation tasks. The results show considerable improvements in the rate-distortion performance of the assisted tasks. Moreover, using the proposed representations, the performance of the base tasks is also improved. The results suggest that the proposed method induces simpler representations that are more compatible with downstream processes.
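
For readers unfamiliar with predictive $\mathcal{V}$-information, the standard definitions from the general framework (Xu et al., 2020) are sketched below. The notation is that of the general framework and is an assumption here; the paper itself may use different symbols. $\mathcal{V}$ denotes a restricted family of predictors (for example, decoders of a given architecture), and $f[x]$ is the distribution over $Y$ that a predictor $f$ outputs given side information $x$, with $\varnothing$ meaning no side information.

$$H_{\mathcal{V}}(Y \mid X) = \inf_{f \in \mathcal{V}} \mathbb{E}_{x, y \sim X, Y}\left[-\log f[x](y)\right]$$

$$H_{\mathcal{V}}(Y \mid \varnothing) = \inf_{f \in \mathcal{V}} \mathbb{E}_{y \sim Y}\left[-\log f[\varnothing](y)\right]$$

$$I_{\mathcal{V}}(X \to Y) = H_{\mathcal{V}}(Y \mid \varnothing) - H_{\mathcal{V}}(Y \mid X)$$

Under this reading, a representation can carry a large amount of Shannon information about a second task while offering little $\mathcal{V}$-information, because the predictors in $\mathcal{V}$ cannot extract it. This is one way to interpret the mismatch described in the abstract.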

Code Implementations: 1 repo
