Deep Visual Re-Identification with Confidence
This work addresses the challenge of tracking vehicles or pedestrians in scattered camera systems, such as in transportation monitoring, with incremental improvements over existing methods.
The paper tackled the problem of visual re-identification across non-overlapping camera views by proposing a confidence-based learning framework to improve representation learning, resulting in performance boosts across 5 datasets and outperforming specialized state-of-the-art methods.
Transportation systems often rely on understanding the flow of vehicles or pedestrian. From traffic monitoring at the city scale, to commuters in train terminals, recent progress in sensing technology make it possible to use cameras to better understand the demand, i.e., better track moving agents (e.g., vehicles and pedestrians). Whether the cameras are mounted on drones, vehicles, or fixed in the built environments, they inevitably remain scatter. We need to develop the technology to re-identify the same agents across images captured from non-overlapping field-of-views, referred to as the visual re-identification task. State-of-the-art methods learn a neural network based representation trained with the cross-entropy loss function. We argue that such loss function is not suited for the visual re-identification task hence propose to model confidence in the representation learning framework. We show the impact of our confidence-based learning framework with three methods: label smoothing, confidence penalty, and deep variational information bottleneck. They all show a boost in performance validating our claim. Our contribution is generic to any agent of interest, i.e., vehicles or pedestrians, and outperform highly specialized state-of-the-art methods across 5 datasets. The source code and models are shared towards an open science mission.