Toward Minimal Misalignment at Minimal Cost in One-Stage and Anchor-Free Object Detection
This paper addresses a specific architectural issue in object detection models and offers a simple, efficient solution that improves accuracy.
The paper tackles the misalignment problem between the classification and regression branches in one-stage, anchor-free object detectors, showing that it consists of scale misalignment and spatial misalignment. The method, which involves a minor adjustment of the head network and a new label assignment, achieves an improvement of around 3 AP over FCOS baselines across different backbones.
Common object detection models consist of classification and regression branches. Because the two tasks are driven by different objectives, these branches differ in their sensitivity to features from the same scale level and the same spatial location. The point-based prediction method, which assumes that a point with high classification confidence also has high regression quality, leads to the misalignment problem. Our analysis shows that the problem is further composed of scale misalignment and spatial misalignment. We aim to resolve it at minimal cost: a minor adjustment of the head network and a new label assignment method that replaces the rigid one. Our experiments show that, compared to the baseline FCOS, a one-stage and anchor-free object detection model, our model consistently gets around 3 AP improvement with different backbones, demonstrating both the simplicity and the efficiency of our method.
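The misalignment described above can be illustrated with a toy example. The sketch below is not the paper's method or code; the function names (`iou`, `misalignment_gap`) and the sample scores and boxes are hypothetical, chosen only to show a case where the point with the highest classification confidence is not the point with the best box regression.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def misalignment_gap(cls_scores, pred_boxes, gt_box):
    """Compare the most confident point with the best-regressed point.

    Returns (index of top classification score,
             index of top IoU with the ground truth,
             IoU lost by trusting the classification score).
    """
    ious = [iou(box, gt_box) for box in pred_boxes]
    top_cls = max(range(len(cls_scores)), key=lambda i: cls_scores[i])
    top_iou = max(range(len(ious)), key=lambda i: ious[i])
    return top_cls, top_iou, ious[top_iou] - ious[top_cls]


# Hypothetical predictions from three points for one ground-truth box.
gt_box = (0.0, 0.0, 10.0, 10.0)
pred_boxes = [
    (0.0, 0.0, 10.0, 10.0),   # perfect box, moderate confidence
    (2.0, 2.0, 12.0, 12.0),   # shifted box, highest confidence
    (0.0, 0.0, 5.0, 10.0),    # half-width box, low confidence
]
cls_scores = [0.6, 0.9, 0.3]

top_cls, top_iou, gap = misalignment_gap(cls_scores, pred_boxes, gt_box)
print(f"most confident point: {top_cls}, best-regressed point: {top_iou}, IoU gap: {gap:.3f}")
```

In this toy case, score-based selection keeps the shifted box and discards the perfect one, which is the kind of disagreement between the two branches that the abstract refers to.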