CV · Aug 21, 2024

CARLA Drone: Monocular 3D Object Detection from a Different Perspective

arXiv:2408.11958v2 · 11 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the problem of limited camera perspective diversity in 3D detection for autonomous systems, representing an incremental advancement with domain-specific impact.

The paper tackles a limitation of monocular 3D object detection: methods tend to perform well only on specific camera perspectives. It introduces the CARLA Drone dataset (CDrone) and a data augmentation pipeline called GroundMix, and reports average precision on par with or higher than the previous state of the art across all tested datasets.

Existing techniques for monocular 3D detection have a serious restriction. They tend to perform well only on a limited set of benchmarks, faring well either on ego-centric car views or on traffic camera views, but rarely on both. To encourage progress, this work advocates for an extended evaluation of 3D detection frameworks across different camera perspectives. We make two key contributions. First, we introduce the CARLA Drone dataset, CDrone. Simulating drone views, it substantially expands the diversity of camera perspectives in existing benchmarks. Despite its synthetic nature, CDrone represents a real-world challenge. To show this, we confirm that previous techniques struggle to perform well on both CDrone and a real-world 3D drone dataset. Second, we develop an effective data augmentation pipeline called GroundMix. Its distinguishing element is the use of the ground for creating 3D-consistent augmentation of a training image. GroundMix significantly boosts the detection accuracy of a lightweight one-stage detector. In our expanded evaluation, we achieve average precision on par with or substantially higher than the previous state of the art across all tested datasets.
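The abstract says only that GroundMix uses the ground to keep augmentations 3D-consistent; it does not spell out the procedure. As a rough illustration of the kind of geometry such an approach could rely on, the sketch below back-projects a pixel onto a ground plane and projects the resulting 3D point back into the image. It is not the paper's implementation. The function names, the pinhole camera model, the camera-frame convention (x right, y down, z forward), and the plane parameterisation `normal · X + offset = 0` are all assumptions made for this example.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def pixel_to_ground(u: float, v: float,
                    fx: float, fy: float, cx: float, cy: float,
                    normal: Vec3, offset: float) -> Optional[Vec3]:
    """Intersect the viewing ray of pixel (u, v) with a ground plane.

    Hypothetical helper: the plane is given in camera coordinates as
    normal . X + offset = 0. Returns the 3D point in camera coordinates,
    or None if the ray never reaches the plane in front of the camera.
    """
    # Viewing ray direction for a pinhole camera (assumed model).
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the ground plane
    t = -offset / denom
    if t <= 0:
        return None  # intersection lies behind the camera
    return (t * ray[0], t * ray[1], t * ray[2])


def project(point: Vec3,
            fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Project a 3D point in camera coordinates back to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Assumed toy setup: a camera looking straight down from 10 m,
# so the ground is the plane z = 10 in camera coordinates.
ground_normal = (0.0, 0.0, 1.0)
ground_offset = -10.0
point = pixel_to_ground(740.0, 360.0, 1000.0, 1000.0, 640.0, 360.0,
                        ground_normal, ground_offset)
```

In this toy setup a pixel 100 px to the right of the principal point lands about 1 m to the side on the ground, 10 m from the camera. An augmentation that pastes or moves an object could use a mapping like this to assign the object a ground position, and therefore a depth and 3D box, that agrees with where it appears in the image.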

