RO, Mar 29, 2025

Deep Visual Servoing of an Aerial Robot Using Keypoint Feature Extraction

arXiv:2503.23171, 3 citations, h-index: 4
Originality: Synthesis-oriented
AI Analysis

The paper provides a more robust and practical visual servoing method for aerial robots, but its improvement over existing deep-learning approaches is incremental.

This paper addresses image-based visual servoing for aerial robots using deep-learning keypoint detection, eliminating the need for man-made markers and improving robustness to occlusion, illumination changes, and clutter. Physics-based simulations in ROS Gazebo demonstrate the method's effectiveness.

The problem of image-based visual servoing (IBVS) of an aerial robot using deep-learning-based keypoint detection is addressed in this article. A monocular RGB camera mounted on the platform is utilized to collect the visual data. A convolutional neural network (CNN) is then employed to extract the features serving as the visual data for the servoing task. This paper contributes to the field not only by circumventing the challenge stemming from the need for man-made marker detection in conventional visual servoing techniques, but also by enhancing the robustness against undesirable factors including occlusion, varying illumination, clutter, and background changes, thereby broadening the applicability of perception-guided motion control tasks in aerial robots. Additionally, extensive physics-based ROS Gazebo simulations are conducted to assess the effectiveness of this method, in contrast to many existing studies that rely solely on physics-less simulations. A demonstration video is available at https://youtu.be/Dd2Her8Ly-E.
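
For readers new to IBVS, the classical point-feature control law gives a sense of how detected keypoints can drive camera motion. The sketch below is the textbook formulation, not necessarily the controller used in this paper. It assumes keypoints are already expressed in normalized image coordinates and that a depth estimate is available for each one; the function names, the gain value, and the test data are hypothetical and chosen only for illustration.

```python
import numpy as np


def interaction_matrix(points, depths):
    """Stack the 2x6 point-feature interaction matrices for all keypoints.

    points: iterable of (x, y) normalized image coordinates.
    depths: iterable of depth estimates Z, one per keypoint (assumed known).
    """
    rows = []
    for (x, y), Z in zip(points, depths):
        rows.append([-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y])
        rows.append([0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x])
    return np.array(rows)


def ibvs_velocity(points, desired, depths, gain=0.5):
    """Classical IBVS law: v = -gain * pinv(L) @ (s - s_desired).

    Returns a 6-vector camera twist (vx, vy, vz, wx, wy, wz).
    """
    error = (np.asarray(points, dtype=float)
             - np.asarray(desired, dtype=float)).reshape(-1)
    L = interaction_matrix(points, depths)
    return -gain * np.linalg.pinv(L) @ error


# Hypothetical example: four keypoints at unit depth, target is a centered square.
desired = [(-0.1, -0.1), (0.1, -0.1), (0.1, 0.1), (-0.1, 0.1)]
current = [(x + 0.1, y) for x, y in desired]  # every keypoint shifted along +x
depths = [1.0] * 4

twist = ibvs_velocity(current, desired, depths)
print(twist)
```

In a pipeline like the one the abstract describes, `current` would come from the CNN keypoint detector on each frame, and the resulting camera twist would still have to be mapped to commands the aerial platform can execute, a step this sketch leaves out.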

