CV · Mar 2, 2023

Task-Specific Context Decoupling for Object Detection

arXiv:2303.01047v1 · 56 citations · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in object detection for computer vision applications, offering an incremental improvement over existing methods.

The paper tackles the problem of inconsistent feature context preferences between classification and localization in object detection by proposing a Task-Specific Context Decoupling (TSCODE) head, which improves different detectors by over 1.0 AP with less computational cost.

Classification and localization are two main sub-tasks in object detection. Nonetheless, these two tasks have inconsistent preferences for feature context, i.e., localization expects more boundary-aware features to accurately regress the bounding box, while more semantic context is preferred for object classification. Existing methods usually leverage disentangled heads to learn different feature context for each task. However, the heads are still applied on the same input features, which leads to an imperfect balance between classification and localization. In this work, we propose a novel Task-Specific COntext DEcoupling (TSCODE) head which further disentangles the feature encoding for the two tasks. For classification, we generate a spatially-coarse but semantically-strong feature encoding. For localization, we provide a high-resolution feature map containing more edge information to better regress object boundaries. TSCODE is plug-and-play and can be easily incorporated into existing detection pipelines. Extensive experiments demonstrate that our method stably improves different detectors by over 1.0 AP with less computational cost. Our code and models will be publicly released.
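
To make the decoupling idea concrete, the following is a minimal sketch of feeding the two tasks different input features. It is an illustration under assumptions, not the paper's implementation: the helper names (`classification_context`, `localization_context`), the use of single-channel feature maps stored as nested lists, 2x2 average pooling, and nearest-neighbor upsampling are all hypothetical choices made here for clarity. The abstract only states that classification receives a spatially coarse, semantically strong encoding and localization receives a high-resolution one.

```python
# Illustrative sketch only: toy single-channel feature maps as nested lists.
# All function names and fusion choices below are assumptions, not the
# TSCODE authors' actual operators.

def avg_pool2x(fmap):
    """Halve spatial resolution by averaging each 2x2 block."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def upsample2x(fmap):
    """Double spatial resolution with nearest-neighbor repetition."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_maps(a, b):
    """Element-wise sum of two maps with identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def shape(fmap):
    return (len(fmap), len(fmap[0]))


def classification_context(p_cur, p_deeper):
    """Coarse, semantic input: pool the current level down to the deeper
    level's resolution and stack the two maps as channels."""
    return [avg_pool2x(p_cur), p_deeper]


def localization_context(p_shallower, p_cur):
    """High-resolution input: upsample the current level and add it to the
    shallower (finer) level, which carries more edge detail."""
    return add_maps(p_shallower, upsample2x(p_cur))


# Three hypothetical pyramid levels: fine (8x8), current (4x4), coarse (2x2).
p_shallow = [[1.0] * 8 for _ in range(8)]
p_cur = [[2.0] * 4 for _ in range(4)]
p_deep = [[3.0] * 2 for _ in range(2)]

cls_feat = classification_context(p_cur, p_deep)   # 2 channels at 2x2
loc_feat = localization_context(p_shallow, p_cur)  # 1 channel at 8x8
```

In this toy setup the classification branch sees a map with a quarter of the spatial positions of the current level, while the localization branch sees four times as many. This mirrors the abstract's contrast between coarse semantic context and boundary-aware detail, though the fusion operators actually used by the paper may differ.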

