CV · AI · Feb 5, 2021

Custom Object Detection via Multi-Camera Self-Supervised Learning

arXiv:2102.03442v1 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of efficiently building custom object detection models for multi-camera networks. It is an incremental advance aimed at practitioners in surveillance and smart-city applications.

This paper introduces MCSSL, a self-supervised learning method for custom object detection in multi-camera networks. It uses epipolar geometry, tracking, and reID to associate bounding boxes across cameras and generate pseudo-labels. Compared with legacy self-training methods, MCSSL improves average mAP by 5.44% on the WildTrack dataset and 6.76% on the CityFlow dataset.

This paper proposes MCSSL, a self-supervised learning approach for building custom object detection models in multi-camera networks. MCSSL associates bounding boxes between cameras with overlapping fields of view by leveraging epipolar geometry and state-of-the-art tracking and reID algorithms, and prudently generates two sets of pseudo-labels to fine-tune backbone and detection networks respectively in an object detection model. To train effectively on pseudo-labels, a powerful reID-like pretext task with consistency loss is constructed for model customization. Our evaluation shows that compared with legacy self-training methods, MCSSL improves average mAP by 5.44% and 6.76% on WildTrack and CityFlow datasets, respectively.
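
The geometric part of the cross-camera association described above can be illustrated with a small sketch. This is not the paper's implementation: it assumes a known fundamental matrix `F` between two cameras, uses box centers as the matched points, and applies a greedy one-to-one matching with a hypothetical pixel threshold `max_dist`. The tracking and reID cues that MCSSL also relies on are left out, and all function names are illustrative.

```python
import numpy as np


def box_center(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return np.array([(x_min + x_max) / 2.0, (y_min + y_max) / 2.0])


def epipolar_distance(F, pt_a, pt_b):
    """Pixel distance from pt_b (camera B) to the epipolar line of pt_a (camera A).

    The epipolar line in camera B is l = F @ [x_a, y_a, 1], written as
    a*x + b*y + c = 0. Assumes the line is not degenerate (a, b not both zero).
    """
    a, b, c = F @ np.append(pt_a, 1.0)
    return abs(a * pt_b[0] + b * pt_b[1] + c) / np.hypot(a, b)


def associate_boxes(F, boxes_a, boxes_b, max_dist=10.0):
    """Greedy one-to-one matching of boxes across two cameras.

    A pair (i, j) is a candidate when the center of boxes_b[j] lies within
    max_dist pixels of the epipolar line induced by the center of boxes_a[i].
    Candidates are accepted in order of increasing distance.
    Illustrative assumption: geometry only, no tracking or reID features.
    """
    candidates = []
    for i, box_a in enumerate(boxes_a):
        for j, box_b in enumerate(boxes_b):
            d = epipolar_distance(F, box_center(box_a), box_center(box_b))
            if d <= max_dist:
                candidates.append((d, i, j))
    candidates.sort()

    used_a, used_b, matches = set(), set(), []
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            matches.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return sorted(matches)


# Toy setup: a rectified stereo pair, where epipolar lines are horizontal
# and corresponding points share the same image row.
F_rectified = np.array([[0.0, 0.0, 0.0],
                        [0.0, 0.0, -1.0],
                        [0.0, 1.0, 0.0]])
boxes_cam_a = [(0, 0, 10, 10), (0, 100, 10, 120)]
boxes_cam_b = [(50, 98, 70, 122), (40, 1, 60, 9)]
matches = associate_boxes(F_rectified, boxes_cam_a, boxes_cam_b)
```

Matched pairs such as these could then seed pseudo-labels: a detection found in one camera suggests where an object should appear in the other. On its own the epipolar constraint is ambiguous, because several boxes can lie near the same line, which is presumably why the method combines it with tracking and reID.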

