CV · AI · Nov 2, 2022

TSAA: A Two-Stage Anchor Assignment Method towards Anchor Drift in Crowded Object Detection

arXiv:2211.00826v2 · 1 citation · h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses a specific issue in crowded object detection, offering an incremental improvement to existing anchor-based methods.

The paper tackles the problem of anchor drift in crowded object detection, where positive anchors may not regress toward the most overlapping object, leading to ambiguous predictions and higher false-positive rates. The proposed TSAA method uses prediction boxes instead of fixed anchors for assignment, significantly improving performance on detectors like RetinaNet, Faster-RCNN, and YOLOv3 without extra computational costs.

In current anchor-based detectors, a positive anchor box is intuitively assigned to the object that overlaps it the most. The label assigned to each anchor directly determines the optimization direction of the corresponding prediction box, including both box regression and category prediction. In our practice of crowded object detection, however, the results show that a positive anchor does not always regress toward the object that overlaps it the most when multiple objects overlap. We call this phenomenon anchor drift. Anchor drift shows that the anchor-object matching relation, which is determined by the degree of overlap between anchors and objects, is not always optimal. Conflicts between the fixed matching relation and the experience learned earlier in training may cause ambiguous predictions and thus raise the false-positive rate. In this paper, a simple but efficient adaptive two-stage anchor assignment (TSAA) method is proposed. It uses the final prediction boxes rather than the fixed anchors to calculate the degree of overlap with objects, and so determines which object each anchor should regress toward. The participation of the prediction box makes the anchor-object assignment mechanism adaptive. Extensive experiments are conducted on three classic detectors (RetinaNet, Faster-RCNN and YOLOv3) on CrowdHuman and COCO to evaluate the effectiveness of TSAA. The results show that TSAA can significantly improve the detectors' performance without additional computational costs or changes to the network structure.
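
The idea of re-matching with prediction boxes can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact procedure, so the function names (`iou`, `best_match`, `two_stage_assign`), the 0.5 positive threshold, and the choice to decide positive versus negative from the fixed anchor in the first stage are all assumptions made for illustration. Boxes are `(x1, y1, x2, y2)` tuples.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_match(box: Box, objects: List[Box]) -> int:
    """Index of the object that overlaps `box` the most."""
    return max(range(len(objects)), key=lambda j: iou(box, objects[j]))


def two_stage_assign(anchors: List[Box], predictions: List[Box],
                     objects: List[Box], pos_thr: float = 0.5) -> List[int]:
    """Return one object index per anchor, or -1 for a negative anchor."""
    targets = []
    for anchor, pred in zip(anchors, predictions):
        # Stage 1 (assumed): conventional matching from the fixed anchor
        # decides whether the anchor is positive at all.
        j = best_match(anchor, objects)
        if iou(anchor, objects[j]) < pos_thr:
            targets.append(-1)
            continue
        # Stage 2: re-pick the regression target using the current
        # prediction box instead of the fixed anchor.
        targets.append(best_match(pred, objects))
    return targets


# Two heavily overlapping objects, as in a crowded scene.
objects = [(0, 0, 10, 10), (4, 0, 14, 10)]
# The first anchor overlaps object 0 the most; the second overlaps nothing.
anchors = [(1, 0, 11, 10), (50, 50, 60, 60)]
# The first anchor's prediction has drifted toward object 1.
predictions = [(3.5, 0, 13.5, 10), (50, 50, 60, 60)]

anchor_only = [best_match(a, objects) for a in anchors[:1]]
targets = two_stage_assign(anchors, predictions, objects)
```

In this toy case the fixed anchor alone would select object 0, while the drifted prediction box overlaps object 1 more, so the second stage reassigns the anchor to object 1 rather than pulling the prediction back toward object 0.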
