CUAHN-VIO: Content-and-Uncertainty-Aware Homography Network for Visual-Inertial Odometry
This work addresses real-world navigation for agile mobile robots, specifically micro aerial vehicles, by improving the robustness and efficiency of ego-motion estimation, though it appears incremental, as it builds on existing VIO approaches.
The authors tackle robust ego-motion estimation for micro aerial vehicles by proposing CUAHN-VIO, a visual-inertial odometry system whose self-supervised homography network also predicts the uncertainty of its output. The system achieves accuracy rivaling state-of-the-art methods with low network inference time (~23 ms), enabling onboard navigation on an embedded processor.
Learning-based visual ego-motion estimation is promising yet not ready for navigating agile mobile robots in the real world. In this article, we propose CUAHN-VIO, a robust and efficient monocular visual-inertial odometry (VIO) system designed for micro aerial vehicles (MAVs) equipped with a downward-facing camera. The vision frontend is a content-and-uncertainty-aware homography network (CUAHN) that is robust to non-homography image content and to failure cases of network prediction. It not only predicts the homography transformation but also estimates its uncertainty. The training is self-supervised, so it does not require ground truth, which is often difficult to obtain. The network generalizes well, enabling "plug-and-play" deployment in new environments without fine-tuning. A lightweight extended Kalman filter (EKF) serves as the VIO backend and uses the mean prediction and variance estimation from the network for visual measurement updates. CUAHN-VIO is evaluated on a high-speed public dataset and shows accuracy rivaling that of state-of-the-art (SOTA) VIO approaches. Thanks to its robustness to motion blur, low network inference time (~23 ms), and stable processing latency (~26 ms), CUAHN-VIO successfully runs onboard an Nvidia Jetson TX2 embedded processor to navigate a fast autonomous MAV.
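The way a filter backend can consume both the mean and the variance predicted by a network can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear measurement model and a generic measurement vector, whereas the EKF described above would linearize a homography-based measurement model. The function name `kalman_update` and the symbols `x`, `P`, `z`, `R`, and `H` are hypothetical and chosen only for illustration. The point shown is that the network's variance fills the measurement noise covariance `R`, so an uncertain prediction is automatically down-weighted.

```python
import numpy as np

def kalman_update(x, P, z, R, H):
    """One Kalman measurement update (linear sketch, not the paper's EKF).

    x: (n,) prior state mean        P: (n, n) prior state covariance
    z: (m,) measurement, here standing in for the network's mean prediction
    R: (m, m) measurement noise covariance, here built from the network's variance
    H: (m, n) measurement matrix (assumed linear for this illustration)
    """
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R                    # innovation covariance
    # Kalman gain K = P H^T S^-1, computed via a linear solve (P and S are symmetric)
    K = np.linalg.solve(S, H @ P).T
    x_new = x + K @ y
    # Joseph form keeps the updated covariance symmetric and positive semi-definite
    I_KH = np.eye(len(x)) - K @ H
    P_new = I_KH @ P @ I_KH.T + K @ R @ K.T
    return x_new, P_new

# Same measurement, two different (hypothetical) network variances
x0, P0, H = np.zeros(2), np.eye(2), np.eye(2)
z = np.array([1.0, 1.0])
x_confident, P_confident = kalman_update(x0, P0, z, np.diag([0.01, 0.01]), H)
x_uncertain, P_uncertain = kalman_update(x0, P0, z, np.diag([100.0, 100.0]), H)
```

With a small predicted variance the state moves almost all the way to the measurement, while with a large predicted variance it barely moves. This is the mechanism by which an uncertainty-aware frontend can limit the effect of failed predictions on the state estimate.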