MM · CV · Dec 17, 2025

A Preprocessing Framework for Video Machine Vision under Compression

arXiv:2512.15331v1 · 3 citations · h-index: 8 · DCC
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient video compression tailored to machine vision systems and offers a practical solution for real-world applications. It is an incremental advance because it builds on existing standard codecs rather than replacing them.

The paper tackles the problem that video compression methods optimized for human perception overlook the demands of machine vision. It proposes a preprocessing framework that is reported to save over 15% of bitrate relative to using a standard codec alone, while maintaining task accuracy.

There has been a growing trend in compressing and transmitting videos from terminals for machine vision tasks. Nevertheless, most video coding optimization methods focus on minimizing distortion according to human perceptual metrics, overlooking the heightened demands posed by machine vision systems. In this paper, we propose a video preprocessing framework tailored for machine vision tasks to address this challenge. The proposed method incorporates a neural preprocessor that retains information crucial for subsequent tasks, thereby improving rate-accuracy performance. We further introduce a differentiable virtual codec to provide constraints on rate and distortion during the training stage. At test time, we directly apply widely used standard codecs, so our solution can be easily applied to real-world scenarios. We conducted extensive experiments evaluating our compression method on two typical downstream tasks with various backbone networks. The experimental results indicate that our approach can save over 15% of bitrate compared to using only the standard codec (the anchor).
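
To make the "bitrate saving" claim concrete, the following is a minimal sketch of one way such a figure could be computed: interpolate each rate-accuracy curve to find the bitrate needed for a given task accuracy, then compare the two bitrates. The abstract does not state the exact evaluation protocol, so the function names (`bitrate_at_accuracy`, `bitrate_saving`), the linear interpolation, and all data points below are assumptions made for illustration, not the paper's method or results.

```python
def bitrate_at_accuracy(curve, target):
    """Linearly interpolate the bitrate needed to reach `target` accuracy.

    `curve` is a list of (bitrate_kbps, accuracy) points sorted by bitrate,
    with accuracy increasing along the curve (an assumption of this sketch).
    """
    for (r0, a0), (r1, a1) in zip(curve, curve[1:]):
        # Skip flat segments to avoid dividing by zero.
        if a1 > a0 and a0 <= target <= a1:
            t = (target - a0) / (a1 - a0)
            return r0 + t * (r1 - r0)
    raise ValueError("target accuracy lies outside the measured curve")


def bitrate_saving(anchor_curve, proposed_curve, target):
    """Fraction of bitrate saved by the proposed pipeline at equal accuracy."""
    r_anchor = bitrate_at_accuracy(anchor_curve, target)
    r_proposed = bitrate_at_accuracy(proposed_curve, target)
    return (r_anchor - r_proposed) / r_anchor


# Hypothetical points for illustration only; NOT results from the paper.
anchor = [(200.0, 0.60), (400.0, 0.70), (800.0, 0.76)]    # standard codec alone
proposed = [(200.0, 0.64), (400.0, 0.73), (800.0, 0.77)]  # preprocessor + codec

saving = bitrate_saving(anchor, proposed, target=0.70)
print(f"bitrate saving at accuracy 0.70: {saving:.1%}")
```

A single operating point is used here for simplicity. Comparisons in the video coding literature often average the saving over a range of accuracies instead, which would require integrating over both curves.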

