IV, CV, Aug 15, 2022

Task Oriented Video Coding: A Survey

arXiv:2208.07313v3, 5 citations, h-index: 2
Originality: Synthesis-oriented
AI Analysis

This paper addresses the inefficiency of conventional video compression when the output is analyzed by machines, an increasingly common case in computer vision applications. As a survey, its contribution is a synthesis of existing work rather than a new method.

The paper surveys recent progress in video coding optimized for computer vision tasks, highlighting that conventional standards designed for human viewing are suboptimal when videos are analyzed by deep neural networks.

Video coding technology has been continuously improved to achieve higher compression ratios at higher resolutions. However, state-of-the-art video coding standards, such as H.265/HEVC and Versatile Video Coding, are still designed under the assumption that the compressed video will be watched by humans. With the tremendous advance and maturation of deep neural networks in solving computer vision tasks, more and more videos are analyzed directly by deep neural networks without human involvement. Such a conventional design is not optimal when the compressed video is consumed by computer vision applications. While the human visual system is consistently sensitive to high-contrast content, the impact of individual pixels on computer vision algorithms depends on the specific task. In this paper, we explore and summarize recent progress on computer vision task oriented video coding and on the emerging video coding standard, Video Coding for Machines.

