Conditional and Residual Methods in Scalable Coding for Humans and Machines
This work addresses scalable coding for computer vision tasks such as semantic segmentation and object detection, but it is incremental: it builds on existing methods and achieves similar performance.
The paper addresses rate-distortion optimization in scalable coding for humans and machines by proposing conditional and residual methods. On the Cityscapes and COCO datasets, the two methods perform similarly, with rate-distortion curves contained within the baselines.
We present methods for conditional and residual coding in the context of scalable coding for humans and machines. Our focus is on optimizing the rate-distortion performance of the reconstruction task using the information available in the computer vision task. We include an information analysis of both approaches to provide baselines, and we propose an entropy model suitable for conditional coding that offers increased modelling capacity with tractability similar to that of previous work. We apply these methods to image reconstruction in two settings: one using representations created for semantic segmentation on the Cityscapes dataset, and one using representations created for object detection on the COCO dataset. In both experiments, the conditional and residual methods perform similarly, and the resulting rate-distortion curves lie within our baselines.
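As a rough illustration of why an information analysis can bound both approaches, the sketch below compares three entropies on a toy discrete source. It is not the paper's analysis or its entropy model: the joint distribution, the variable names, and the choice of a plain difference X - Y as the residual are all assumptions made for this example. It relies only on the standard facts that H(X|Y) <= H(X - Y) and H(X|Y) <= H(X), where X stands in for the reconstruction-task signal and Y for the information already available from the computer vision task.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, key):
    """Collapse a joint pmf {(x, y): p} onto the value key(x, y)."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[key(x, y)] += p
    return dict(out)


def conditional_entropy(joint):
    """H(X|Y) computed with the chain rule: H(X, Y) - H(Y)."""
    return entropy(joint) - entropy(marginal(joint, lambda x, y: y))


# Hypothetical toy joint pmf over (x, y); the probabilities sum to 1.
joint = {
    (0, 0): 0.30, (1, 0): 0.10,
    (0, 1): 0.05, (1, 1): 0.25,
    (2, 1): 0.10, (2, 2): 0.20,
}

h_independent = entropy(marginal(joint, lambda x, y: x))   # H(X): ignore Y
h_residual = entropy(marginal(joint, lambda x, y: x - y))  # H(X - Y): code the residual
h_conditional = conditional_entropy(joint)                 # H(X|Y): ideal conditional coder

print(f"H(X)     = {h_independent:.4f} bits")
print(f"H(X - Y) = {h_residual:.4f} bits")
print(f"H(X|Y)   = {h_conditional:.4f} bits")
```

In this toy setting the ideal conditional rate is never larger than the residual rate, because subtracting Y and then coding the difference without reference to Y discards any remaining dependence between the residual and Y. Learned codecs need not reach these ideal rates, which is consistent with the two methods performing similarly in practice.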