CV · AI · Sep 23, 2023

UniHead: Unifying Multi-Perception for Detection Heads

Tencent
arXiv:2309.13242v2 · 22 citations · h-index: 22 · Has Code
Originality: Incremental advance
AI Analysis

UniHead addresses a known bottleneck in object detection: the limited perceptual capability of the detection head. For computer vision researchers and practitioners it is a plug-and-play module, an incremental advance that still yields measurable AP gains on several existing detectors.

The paper targets detection heads that lack comprehensive perceptual capabilities. It proposes UniHead, a unified detection head that integrates deformation, global, and cross-task perception, and reports gains of +2.7 AP on RetinaNet, +2.9 AP on FreeAnchor, and +2.1 AP on GFL on the COCO dataset.

The detection head constitutes a pivotal component within object detectors, tasked with executing both classification and localization functions. Regrettably, the commonly used parallel head often lacks omni perceptual capabilities, such as deformation perception, global perception and cross-task perception. Despite numerous methods attempting to enhance these abilities from a single aspect, achieving a comprehensive and unified solution remains a significant challenge. In response to this challenge, we develop an innovative detection head, termed UniHead, to unify three perceptual abilities simultaneously. More precisely, our approach (1) introduces deformation perception, enabling the model to adaptively sample object features; (2) proposes a Dual-axial Aggregation Transformer (DAT) to adeptly model long-range dependencies, thereby achieving global perception; and (3) devises a Cross-task Interaction Transformer (CIT) that facilitates interaction between the classification and localization branches, thus aligning the two tasks. As a plug-and-play method, the proposed UniHead can be conveniently integrated with existing detectors. Extensive experiments on the COCO dataset demonstrate that our UniHead can bring significant improvements to many detectors. For instance, the UniHead can obtain +2.7 AP gains in RetinaNet, +2.9 AP gains in FreeAnchor, and +2.1 AP gains in GFL. The code is available at https://github.com/zht8506/UniHead.
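To make the cross-task idea in point (3) concrete, the following is a minimal sketch of two feature branches exchanging information through scaled dot-product cross-attention. It is not the authors' CIT: the abstract does not describe the module's internals, so the function names (cross_attention, cross_task_interaction), the absence of learned projections, and the residual addition are all assumptions made for illustration. The reference implementation is in the repository linked above.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query row attends over all key/value rows.

    Assumption: no learned Q/K/V projections, to keep the sketch dependency-free.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def cross_task_interaction(cls_feats, loc_feats):
    """Hypothetical helper: each branch adds context gathered from the other branch."""
    cls_ctx = cross_attention(cls_feats, loc_feats, loc_feats)
    loc_ctx = cross_attention(loc_feats, cls_feats, cls_feats)
    # Residual connection (an assumption): keep the original features, add the context.
    new_cls = [[a + b for a, b in zip(f, c)] for f, c in zip(cls_feats, cls_ctx)]
    new_loc = [[a + b for a, b in zip(f, c)] for f, c in zip(loc_feats, loc_ctx)]
    return new_cls, new_loc


# Toy inputs: three spatial positions, two channels per branch.
cls_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
loc_feats = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
new_cls, new_loc = cross_task_interaction(cls_feats, loc_feats)
```

Both outputs keep the shape of their inputs, which is what would let such a block sit between existing head layers. A real module would add learned projections, multiple heads, and normalization.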

Code Implementations: 1 repo