CV · Jul 7, 2022

Should All Proposals be Treated Equally in Object Detection?

arXiv:2207.03520v1 · 5 citations · h-index: 83
Originality: Highly original
AI Analysis

This paper addresses efficiency challenges in resource-constrained vision tasks, offering a novel approach that improves object detection accuracy without increasing the computational budget.

The paper tackles the complexity-precision trade-off in object detection by proposing unequal processing of proposals, where more computation is assigned to better proposals, resulting in higher accuracy for the same computational cost. It shows that dynamic proposal processing outperforms state-of-the-art detectors like DETR and Sparse R-CNN by a clear margin.

The complexity-precision trade-off of an object detector is a critical problem for resource-constrained vision tasks. Previous works have emphasized detectors implemented with efficient backbones. The impact on this trade-off of proposal processing by the detection head is investigated in this work. It is hypothesized that improved detection efficiency requires a paradigm shift, towards the unequal processing of proposals, assigning more computation to good proposals than poor ones. This results in better utilization of the available computational budget, enabling higher accuracy for the same FLOPs. We formulate this as a learning problem where the goal is to assign operators to proposals, in the detection head, so that the total computational cost is constrained and the precision is maximized. The key finding is that such matching can be learned as a function that maps each proposal embedding into a one-hot code over operators. While this function induces a complex dynamic network routing mechanism, it can be implemented by a simple MLP and learned end-to-end with off-the-shelf object detectors. This 'dynamic proposal processing' (DPP) is shown to outperform state-of-the-art end-to-end object detectors (DETR, Sparse R-CNN) by a clear margin for a given computational complexity.
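
The routing idea in the abstract, a small MLP that maps each proposal embedding to a one-hot code over operators, can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the sizes `D` and `H`, the three toy operators and their relative costs in `OPERATORS`, and the random, untrained weights `W1` and `W2`. The sketch also uses a hard argmax at inference only; learning the selector end-to-end would need a differentiable treatment of the selection, which is omitted.

```python
import random

random.seed(0)

D, H = 8, 16  # embedding and hidden sizes (arbitrary, for illustration)

# Hypothetical operators as (name, relative cost, function).
# These are placeholders, not the operators used in the paper.
OPERATORS = [
    ("heavy", 4.0, lambda x: [2.0 * v for v in x]),
    ("light", 1.0, lambda x: [v + 1.0 for v in x]),
    ("skip", 0.0, lambda x: list(x)),
]
K = len(OPERATORS)


def rand_matrix(rows, cols):
    """Random Gaussian matrix as a list of rows (stands in for learned weights)."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


W1 = rand_matrix(H, D)  # first MLP layer: D -> H
W2 = rand_matrix(K, H)  # second MLP layer: H -> K operator logits


def selector(x):
    """Map one proposal embedding to a one-hot code over the K operators."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in W1]  # ReLU
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in W2]
    best = max(range(K), key=lambda i: logits[i])  # hard argmax
    return [1 if i == best else 0 for i in range(K)]


# Five toy proposal embeddings, each routed to exactly one operator.
proposals = rand_matrix(5, D)
codes = [selector(x) for x in proposals]
outputs = [OPERATORS[c.index(1)][2](x) for c, x in zip(codes, proposals)]

# Total cost of the chosen routing; a budget constraint would bound this value.
total_cost = sum(OPERATORS[c.index(1)][1] for c in codes)

for c in codes:
    print(OPERATORS[c.index(1)][0], c)
print("total cost:", total_cost)
```

Because each proposal selects exactly one operator, the total cost is simply the sum of the selected operators' costs, which is the quantity a computational budget would constrain while precision is maximized.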
