CV | Jul 14, 2024

Plain-Det: A Plain Multi-Dataset Object Detector

arXiv:2407.10083v1 | 12 citations | h-index: 9 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the problem of data scarcity in dense computer vision tasks like object detection for researchers and practitioners, though it is incremental as it builds on existing methods like Def-DETR.

The paper tackles the challenge of training object detectors by combining multiple datasets to overcome annotation difficulties, achieving a mAP of 51.9 on COCO that matches state-of-the-art detectors and demonstrating strong generalization across 13 downstream datasets.

Recent advancements in large-scale foundational models have sparked widespread interest in training highly proficient large vision models. A common consensus revolves around the necessity of aggregating extensive, high-quality annotated data. However, given the inherent challenges in annotating dense tasks in computer vision, such as object detection and segmentation, a practical strategy is to combine and leverage all available data for training purposes. In this work, we propose Plain-Det, which offers flexibility to accommodate new datasets, robustness in performance across diverse datasets, training efficiency, and compatibility with various detection architectures. We utilize Def-DETR, with the assistance of Plain-Det, to achieve a mAP of 51.9 on COCO, matching the current state-of-the-art detectors. We conduct extensive experiments on 13 downstream datasets and Plain-Det demonstrates strong generalization capability. Code is released at https://github.com/ChengShiest/Plain-Det
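As a rough illustration of what "combining all available data" can look like in a training loop, the sketch below draws each training step from one of several source datasets, with each dataset keeping its own label space. This is a generic multi-dataset sampling pattern, not the procedure used by Plain-Det; the dataset names, sizes, class lists, and the square-root weighting are all assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical toy datasets (names, sizes, and classes are illustrative only).
# Each dataset keeps its own label space rather than being merged into one.
DATASETS = {
    "coco_like": {"size": 1000, "classes": ["person", "car", "dog"]},
    "lvis_like": {"size": 400, "classes": ["person", "teapot"]},
    "o365_like": {"size": 2000, "classes": ["car", "lamp", "chair"]},
}


def sampling_weights(datasets, power=0.5):
    """Return per-dataset sampling probabilities.

    Weights are dataset sizes raised to `power` (an assumed heuristic), so
    large datasets are sampled more often without fully dominating.
    """
    raw = {name: info["size"] ** power for name, info in datasets.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def sample_schedule(datasets, num_steps, seed=0):
    """Choose one source dataset per training step.

    A batch drawn at a given step would be scored only against the classes
    of the dataset it came from.
    """
    rng = random.Random(seed)
    weights = sampling_weights(datasets)
    names = list(weights)
    probs = [weights[name] for name in names]
    return rng.choices(names, weights=probs, k=num_steps)


schedule = sample_schedule(DATASETS, num_steps=1000)
print(Counter(schedule))  # how many steps each dataset received
```

Sampling one dataset per step keeps every batch tied to a single label space, which sidesteps the question of how to treat a class that is annotated in one dataset but left unlabeled in another.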

Code Implementations: 1 repo
