CV | AI | Dec 7, 2020

End-to-End Object Detection with Fully Convolutional Network

arXiv:2012.03544v3 | 239 citations | Has Code
AI Analysis

This work aims to simplify the object detection pipeline by removing the need for hand-designed NMS post-processing, an incremental improvement to existing fully convolutional detectors.

This paper addresses the need for non-maximum suppression (NMS) in fully convolutional object detectors, proposing a Prediction-aware One-To-One (POTO) label assignment and a 3D Max Filtering (3DMF) technique. The resulting end-to-end framework achieves competitive performance on COCO and CrowdHuman datasets without NMS.
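For context, the post-processing step the paper removes can be sketched as follows. This is a minimal illustration of standard greedy NMS, not code from the paper's repository; the function names `iou` and `greedy_nms` and the default threshold of 0.5 are assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed corner format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A box is dropped when it overlaps an already-kept box by more
    than iou_threshold (a hand-tuned hyperparameter).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box duplicates the first
```

The threshold is fixed by hand and the loop sits outside the network, which is why a detector that depends on it is not trained end to end.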

Mainstream object detectors based on fully convolutional networks have achieved impressive performance. However, most of them still need hand-designed non-maximum suppression (NMS) post-processing, which impedes fully end-to-end training. In this paper, we analyze the effect of discarding NMS, and the results reveal that proper label assignment plays a crucial role. To this end, for fully convolutional detectors, we introduce a Prediction-aware One-To-One (POTO) label assignment for classification to enable end-to-end detection, which obtains performance comparable to that with NMS. In addition, a simple 3D Max Filtering (3DMF) is proposed to utilize the multi-scale features and improve the discriminability of convolutions in the local region. With these techniques, our end-to-end framework achieves competitive performance against many state-of-the-art detectors with NMS on the COCO and CrowdHuman datasets. The code is available at https://github.com/Megvii-BaseDetection/DeFCN.
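The idea of a prediction-aware one-to-one assignment can be illustrated with a small sketch. The abstract does not give the matching formula, so everything specific here is an assumption: the quality score (a geometric blend of classification score and IoU), the hypothetical weight `alpha`, and the function name `one_to_one_assign`. The search is brute force for readability; a practical implementation would more likely use a bipartite matching solver.

```python
from itertools import permutations
from typing import Dict, List


def one_to_one_assign(cls_scores: List[List[float]],
                      ious: List[List[float]],
                      alpha: float = 0.8) -> Dict[int, int]:
    """Assign exactly one prediction to each ground-truth box.

    cls_scores[i][j]: score of prediction i for the class of ground truth j.
    ious[i][j]:       IoU between predicted box i and ground-truth box j.
    alpha:            hypothetical weight between the two terms (assumed).
    Returns {ground_truth_index: prediction_index}.
    """
    n_pred = len(cls_scores)
    n_gt = len(cls_scores[0]) if n_pred else 0
    # Assumed quality: depends on the current predictions, not only on geometry.
    quality = [[(cls_scores[i][j] ** (1 - alpha)) * (ious[i][j] ** alpha)
                for j in range(n_gt)] for i in range(n_pred)]
    best_total, best = -1.0, ()
    # Exhaustive search over injective mappings; only viable for tiny inputs.
    for perm in permutations(range(n_pred), n_gt):
        total = sum(quality[p][j] for j, p in enumerate(perm))
        if total > best_total:
            best_total, best = total, perm
    return {j: p for j, p in enumerate(best)}


# Three predictions, two ground-truth boxes (made-up numbers).
cls_scores = [[0.9, 0.1], [0.6, 0.2], [0.1, 0.8]]
ious = [[0.5, 0.0], [0.9, 0.1], [0.0, 0.7]]
assignment = one_to_one_assign(cls_scores, ious)
```

In this toy input, prediction 0 has the highest classification score for the first ground truth, but prediction 1 is selected because its box overlaps far better. Every unassigned prediction is treated as background, so the classifier is trained to emit a single confident detection per object rather than a cluster that NMS must prune.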

Code Implementations: 1 repo