RO · CV · Sep 3, 2024

YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers

arXiv:2409.02334v2 · 2 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and stable UAV navigation in applications such as collaboration with humans. It is incremental: it builds on existing marker-based methods and optimizes them for real-time constraints.

The paper tackles the problem of real-time UAV navigation using fiducial markers by proposing YoloTag, which combines a lightweight YOLO v8 detector with a Butterworth filter to reduce noise, achieving improved trajectory tracking performance in indoor experiments.

By harnessing fiducial markers as visual landmarks in the environment, Unmanned Aerial Vehicles (UAVs) can rapidly build precise maps and navigate spaces safely and efficiently, unlocking their potential for fluent collaboration and coexistence with humans. Existing fiducial marker methods rely on handcrafted feature extraction, which sacrifices accuracy. On the other hand, deep learning pipelines for marker detection fail to meet real-time runtime constraints crucial for navigation applications. In this work, we propose YoloTag, a real-time fiducial marker-based localization system. YoloTag uses a lightweight YOLO v8 object detector to accurately detect fiducial markers in images while meeting the runtime constraints needed for navigation. The detected markers are then used by an efficient perspective-n-point algorithm to estimate UAV states. However, this localization system introduces noise, causing instability in trajectory tracking. To suppress noise, we design a higher-order Butterworth filter that effectively eliminates noise through frequency domain analysis. We evaluate our algorithm through real-robot experiments in an indoor environment, comparing the trajectory tracking performance of our method against other approaches in terms of several distance metrics.
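
The noise-suppression step in the abstract can be illustrated with a minimal sketch of a low-pass Butterworth filter applied to a stream of position estimates. This is not the authors' implementation. The filter order (4), cutoff (2 Hz), and sample rate (30 Hz) are assumptions chosen for illustration, as are the function names `butterworth_sections` and `lowpass`. The filter is built as a cascade of second-order sections using the standard bilinear-transform biquad formulas, so it needs only the Python standard library.

```python
import math


def butterworth_sections(order, cutoff_hz, sample_hz):
    """Design an even-order low-pass Butterworth filter as cascaded biquads.

    Returns a list of (b0, b1, b2, a1, a2) tuples, normalized so a0 == 1.
    All parameter values used below are illustrative assumptions.
    """
    if order < 2 or order % 2 != 0:
        raise ValueError("order must be an even integer >= 2")
    if not 0 < cutoff_hz < sample_hz / 2:
        raise ValueError("cutoff must lie between 0 and the Nyquist frequency")

    w0 = 2 * math.pi * cutoff_hz / sample_hz
    cos_w0, sin_w0 = math.cos(w0), math.sin(w0)

    sections = []
    for k in range(order // 2):
        # Each conjugate pole pair of the Butterworth prototype has its own Q.
        theta = math.pi * (2 * k + 1) / (2 * order)
        q = 1 / (2 * math.cos(theta))
        alpha = sin_w0 / (2 * q)
        a0 = 1 + alpha
        b0 = (1 - cos_w0) / 2 / a0
        b1 = (1 - cos_w0) / a0
        b2 = b0
        a1 = -2 * cos_w0 / a0
        a2 = (1 - alpha) / a0
        sections.append((b0, b1, b2, a1, a2))
    return sections


def lowpass(samples, sections):
    """Run the samples through each biquad in turn (direct form I)."""
    out = list(samples)
    for b0, b1, b2, a1, a2 in sections:
        x1 = x2 = y1 = y2 = 0.0
        filtered = []
        for x in out:
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x
            y2, y1 = y1, y
            filtered.append(y)
        out = filtered
    return out


# Hypothetical noisy x-position trace: slow motion plus frame-to-frame jitter.
SAMPLE_HZ = 30.0
raw_x = [
    math.sin(2 * math.pi * 0.2 * n / SAMPLE_HZ) + 0.05 * (-1.0) ** n
    for n in range(300)
]
smooth_x = lowpass(raw_x, butterworth_sections(4, 2.0, SAMPLE_HZ))
```

A causal filter like this one delays the signal slightly, and the delay grows with the order, so a real tracking loop would have to trade smoothness against lag when choosing the order and cutoff.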

Code Implementations: 1 repo
