CV, Aug 18, 2019

A Delay Metric for Video Object Detection: What Average Precision Fails to Tell

arXiv:1908.06368v20.0044 citations
AI Analysis 25

This work addresses the need for better evaluation metrics in video object detection, particularly for latency-critical applications such as autonomous vehicles. The contribution is incremental, since it builds on existing detection methods.

The authors argue that average precision (AP) is insufficient for evaluating video object detection because it does not capture temporal behavior. They propose a new metric, average delay (AD), to measure detection delay, and show that most methods substantially increase delay while largely preserving AP.

Average precision (AP) is a widely used metric for evaluating the detection accuracy of image and video object detectors. In this paper, we analyze object detection from videos and point out that AP alone is not sufficient to capture the temporal nature of video object detection. To tackle this problem, we propose a comprehensive metric, average delay (AD), to measure and compare detection delay. To facilitate delay evaluation, we carefully select a subset of ImageNet VID, which we name ImageNet VIDT, with an emphasis on complex trajectories. By extensively evaluating a wide range of detectors on VIDT, we show that most methods drastically increase the detection delay but still preserve AP well. In other words, AP is not sensitive enough to reflect the temporal characteristics of a video object detector. Our results suggest that video object detection methods should be additionally evaluated with a delay metric, particularly for latency-critical applications such as autonomous vehicle perception.
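The abstract does not spell out how AD is computed, so the following is only a simplified illustration of the underlying idea, not the paper's definition. It assumes delay means the number of frames between an object's first appearance and its first detection, and it uses a per-frame hit rate as a rough stand-in for an accuracy-style score. The function names and the toy tracks are hypothetical.

```python
from typing import Dict, List


def instance_delay(detected_per_frame: List[bool]) -> int:
    """Frames elapsed before the first detection of one object instance.

    detected_per_frame[i] is True if the object was detected in the i-th
    frame after it first appeared. If the object is never detected, the
    full track length is returned (an assumption made for this sketch).
    """
    for delay, detected in enumerate(detected_per_frame):
        if detected:
            return delay
    return len(detected_per_frame)


def mean_delay(tracks: Dict[str, List[bool]]) -> float:
    """Average the per-instance delay over all object instances."""
    if not tracks:
        raise ValueError("at least one track is required")
    delays = [instance_delay(flags) for flags in tracks.values()]
    return sum(delays) / len(delays)


def frame_hit_rate(tracks: Dict[str, List[bool]]) -> float:
    """Fraction of object-frames that were detected (order-insensitive)."""
    total = sum(len(flags) for flags in tracks.values())
    hits = sum(sum(flags) for flags in tracks.values())
    return hits / total


# Hypothetical detector A misses the first two frames of the object.
tracks_a = {"car": [False, False] + [True] * 8}
# Hypothetical detector B misses the last two frames instead.
tracks_b = {"car": [True] * 8 + [False, False]}

print(frame_hit_rate(tracks_a), mean_delay(tracks_a))
print(frame_hit_rate(tracks_b), mean_delay(tracks_b))
```

Both toy detectors hit the same number of frames, so an order-insensitive score cannot tell them apart, while the delay differs. This mirrors the abstract's point that an accuracy metric alone can hide temporal behavior.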
