CV | Apr 15, 2022

FasterVideo: Efficient Online Joint Object Detection And Tracking

arXiv:2204.07394v14 citations | h-index: 27
Originality: Incremental advance
AI Analysis

This work addresses the gap between the efficiency of existing video perception methods and the computational requirements of real-world applications. The advance is incremental because it builds on an existing image detection method, Faster R-CNN.

The paper tackles the problem of computationally demanding object detection and tracking in videos by extending Faster R-CNN to learn instance-level embeddings for data association, achieving high computational efficiency while competing with state-of-the-art methods on standard benchmarks.

Object detection and tracking in videos represent essential and computationally demanding building blocks for current and future visual perception systems. In order to reduce the efficiency gap between available methods and computational requirements of real-world applications, we propose to re-think one of the most successful methods for image object detection, Faster R-CNN, and extend it to the video domain. Specifically, we extend the detection framework to learn instance-level embeddings which prove beneficial for data association and re-identification purposes. Focusing on the computational aspects of detection and tracking, our proposed method reaches a very high computational efficiency necessary for relevant applications, while still managing to compete with recent and state-of-the-art methods as shown in the experiments we conduct on standard object tracking benchmarks.
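
To illustrate how instance-level embeddings can support data association, here is a minimal sketch. It is not the paper's implementation: the greedy matching strategy, the cosine-similarity measure, the `min_similarity` threshold, and the function names `cosine_similarity` and `associate` are all assumptions made for illustration. It assumes each existing track and each new detection already carries an embedding vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def associate(track_embeddings, detection_embeddings, min_similarity=0.5):
    """Greedily pair tracks with detections, most similar pairs first.

    Hypothetical helper, not taken from the paper. Returns a tuple
    (matches, unmatched_tracks, unmatched_detections), where matches is a
    list of (track_index, detection_index) pairs.
    """
    # Collect every pair whose similarity clears the (assumed) threshold.
    candidates = []
    for t, t_emb in enumerate(track_embeddings):
        for d, d_emb in enumerate(detection_embeddings):
            sim = cosine_similarity(t_emb, d_emb)
            if sim >= min_similarity:
                candidates.append((sim, t, d))

    # Visit pairs from most to least similar; each track and each
    # detection may be used at most once.
    candidates.sort(key=lambda c: c[0], reverse=True)
    matches, used_tracks, used_dets = [], set(), set()
    for _, t, d in candidates:
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)

    unmatched_tracks = [t for t in range(len(track_embeddings)) if t not in used_tracks]
    unmatched_dets = [d for d in range(len(detection_embeddings)) if d not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


# Toy example: two tracks, three detections with 2-D embeddings.
tracks = [[1.0, 0.0], [0.0, 1.0]]
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches, lost_tracks, new_detections = associate(tracks, detections)
```

In a tracker built this way, unmatched detections would typically start new tracks and unmatched tracks would be kept for a while for re-identification. A globally optimal assignment (for example, the Hungarian algorithm) could replace the greedy loop at higher cost.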

Code Implementations: 1 repo
