CV Apr 18, 2021

Motion Vector Extrapolation for Video Object Detection

arXiv:2104.08918v25.67 citations Has Code

Originality: Incremental advance

AI Analysis

This enables low-latency video object detection on CPU systems, making it accessible beyond GPU computing.

The paper tackles the speed-accuracy-resource tradeoff in video object detection by combining off-the-shelf object detectors with optical flow motion estimation, achieving up to 25x latency reduction on the MOT20 dataset with minimal accuracy loss.

Despite the continued successes of computationally efficient deep neural network architectures for video object detection, performance continually arrives at the great trilemma of speed versus accuracy versus computational resources (pick two). Current attempts to exploit temporal information in video data to overcome this trilemma are bottlenecked by the state of the art in object detection models. We present MOVEX, a technique which performs video object detection by running off-the-shelf object detectors in parallel with existing optical-flow-based motion estimation techniques. Through a set of experiments on the benchmark MOT20 dataset, we demonstrate that our approach significantly reduces the baseline latency of any given object detector without sacrificing any accuracy. Further latency reduction, up to 25x lower than the original latency, can be achieved with minimal accuracy loss. MOVEX enables low-latency video object detection on common CPU-based systems, thus allowing for high-performance video object detection beyond the domain of GPU computing. The code is available at https://github.com/juliantrue/movex.
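The core idea in the abstract, carrying detector output forward with motion estimates, can be illustrated with a minimal sketch. This is not the authors' implementation (that lives in the linked repository); it assumes a dense per-pixel flow field is already available, and the names `mean_motion` and `extrapolate_boxes` are hypothetical. It covers only the extrapolation step, not the parallel scheduling of detector and flow estimation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Flow = List[List[Tuple[float, float]]]   # flow[y][x] = (dx, dy) for that pixel


def mean_motion(flow: Flow, box: Box) -> Tuple[float, float]:
    """Average the motion vectors that fall inside the box (clipped to the frame)."""
    height, width = len(flow), len(flow[0])
    x1, y1 = max(0, int(box[0])), max(0, int(box[1]))
    x2, y2 = min(width, int(box[2])), min(height, int(box[3]))
    vectors = [flow[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    if not vectors:
        # Box lies outside the frame: assume no motion (illustrative choice).
        return (0.0, 0.0)
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def extrapolate_boxes(boxes: List[Box], flow: Flow) -> List[Box]:
    """Shift each previously detected box by the mean motion inside it."""
    moved = []
    for box in boxes:
        dx, dy = mean_motion(flow, box)
        moved.append((box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy))
    return moved


# Toy 4x4 frame in which every pixel moves one pixel to the right.
toy_flow = [[(1.0, 0.0) for _ in range(4)] for _ in range(4)]
shifted = extrapolate_boxes([(0.0, 0.0, 2.0, 2.0)], toy_flow)
```

In a setup like this, the expensive detector would only need to refresh the boxes occasionally, while the cheaper flow-based update keeps them current on the frames in between.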

