City-Scale Multi-Camera Vehicle Tracking System with Improved Self-Supervised Camera Link Model
This work addresses the efficient, cost-effective deployment of city-scale vehicle tracking systems, for applications such as traffic management, by eliminating the need for human annotation. The contribution is incremental, since it builds on existing camera-link methods.
The paper tackles the problem of matching vehicle trajectories across cameras in multi-camera tracking. It introduces a self-supervised camera link model that extracts inter-camera relationships automatically, without manual annotation, and reaches 61.07% IDF1 on the CityFlow V2 benchmark, a new state of the art among automatic camera-link based methods.
Multi-Target Multi-Camera Tracking (MTMCT) has broad applications and forms the basis for numerous future city-wide systems (e.g., traffic management and crash detection). However, matching vehicle trajectories across different cameras based solely on feature extraction is difficult. This article introduces a multi-camera vehicle tracking system that uses a self-supervised camera link model. In contrast to related works that rely on manual spatial-temporal annotations, our model automatically extracts the multi-camera relationships needed for vehicle matching. The camera link is established through a pre-matching process that evaluates feature similarity, the number of matched pairs, and time variance for high-quality tracks. This process calculates the probability of spatial linkage for every camera combination and selects the highest-scoring pairs to create camera links. By eliminating the need for human annotation, our approach significantly shortens deployment time and makes real-world application more efficient and cost-effective. The resulting camera links support cross-camera matching by imposing spatial-temporal constraints, which reduce the search space for potential vehicle matches. In our experiments, the proposed method achieves a new state of the art among automatic camera-link based methods on the CityFlow V2 benchmark, with an IDF1 score of 61.07%.
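The pre-matching idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the way similarity, pair count, and time variance are combined in `link_score`, the `threshold` and `slack` values, and every name here (`PreMatch`, `build_camera_links`, `is_candidate`) are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class PreMatch:
    """One tentative match between high-quality tracks in two cameras (hypothetical type)."""
    cam_a: str
    cam_b: str
    similarity: float    # appearance-feature similarity, assumed to lie in [0, 1]
    transit_time: float  # seconds between leaving cam_a and entering cam_b


def link_score(matches, min_pairs=3):
    """Assumed scoring rule: reward high similarity, many pairs, low time spread."""
    if len(matches) < min_pairs:
        return 0.0
    sim = mean(m.similarity for m in matches)
    times = [m.transit_time for m in matches]
    # Saturating term that grows with the number of matched pairs.
    count_term = len(matches) / (len(matches) + min_pairs)
    # Relative spread of transit times; a consistent link has a small spread.
    spread = pstdev(times) / (abs(mean(times)) + 1e-9)
    consistency = 1.0 / (1.0 + spread)
    return sim * count_term * consistency


def build_camera_links(pre_matches, threshold=0.3):
    """Score every observed camera pair and keep those above an assumed threshold."""
    grouped = {}
    for m in pre_matches:
        grouped.setdefault((m.cam_a, m.cam_b), []).append(m)
    links = {}
    for pair, ms in grouped.items():
        score = link_score(ms)
        if score >= threshold:
            times = [m.transit_time for m in ms]
            links[pair] = (score, min(times), max(times))
    return links


def is_candidate(links, cam_a, cam_b, transit_time, slack=5.0):
    """Spatial-temporal constraint: cameras must be linked and the timing plausible."""
    link = links.get((cam_a, cam_b))
    if link is None:
        return False
    _, lo, hi = link
    return lo - slack <= transit_time <= hi + slack


# Toy data: c1->c2 is consistent, c1->c3 is weak and erratic.
pre_matches = [
    PreMatch("c1", "c2", 0.9, 10.0),
    PreMatch("c1", "c2", 0.9, 11.0),
    PreMatch("c1", "c2", 0.9, 9.0),
    PreMatch("c1", "c2", 0.9, 10.0),
    PreMatch("c1", "c3", 0.5, 5.0),
    PreMatch("c1", "c3", 0.5, 60.0),
    PreMatch("c1", "c3", 0.5, 120.0),
]
links = build_camera_links(pre_matches)
```

In this toy data only the `c1`→`c2` pair survives as a camera link, so a later matching stage would compare a vehicle leaving `c1` only against tracks that enter `c2` within the learned transit-time window, rather than against every track in every camera.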