CV ET PF RO

Oct 15, 2025

Accelerated Feature Detectors for Visual SLAM: A Comparative Study of FPGA vs GPU

arXiv:2510.13546v1

3.6

Originality: Synthesis-oriented

AI Analysis

This study addresses the need for efficient hardware acceleration in Visual SLAM for resource-limited applications, but it is incremental as it provides a comparative analysis rather than introducing a new method.

This paper tackles the problem of accelerating feature detection in Visual SLAM for power-constrained platforms such as drones by comparing FPGA and GPU implementations. It finds that GPUs outperform FPGAs for non-learning-based detectors such as FAST and Harris, while FPGAs achieve up to 3.1x better run-time performance and 1.4x better energy efficiency for the learning-based SuperPoint detector.

Feature detection is a common yet time-consuming module in Simultaneous Localization and Mapping (SLAM) implementations, which are increasingly deployed on power-constrained platforms such as drones. Graphics Processing Units (GPUs) have been a popular accelerator for computer vision in general, and for feature detection and SLAM in particular. On the other hand, Systems-on-Chip (SoCs) with an integrated Field Programmable Gate Array (FPGA) are also widely available. This paper presents the first study of hardware-accelerated feature detectors considering a Visual SLAM (V-SLAM) pipeline. We offer new insights by comparing the best GPU-accelerated FAST, Harris, and SuperPoint implementations against their FPGA-accelerated counterparts on modern SoCs (Nvidia Jetson Orin and AMD Versal). The evaluation shows that for non-learning-based feature detectors such as FAST and Harris, the GPU implementations and the GPU-accelerated V-SLAM can achieve better run-time performance and energy efficiency than the FPGA implementations and the FPGA-accelerated V-SLAM. However, for a learning-based detector such as SuperPoint, the FPGA implementation can achieve better run-time performance and energy efficiency than the GPU implementation (up to 3.1× and 1.4× improvements, respectively). The FPGA-accelerated V-SLAM can also achieve run-time performance comparable to that of the GPU-accelerated V-SLAM, with better FPS in 2 out of 5 dataset sequences. In terms of accuracy, the results show that the GPU-accelerated V-SLAM is generally more accurate than the FPGA-accelerated V-SLAM. Finally, hardware acceleration for feature detection could further improve the performance of the V-SLAM pipeline by allowing the global bundle adjustment module to be invoked less frequently without sacrificing accuracy.
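The abstract reports improvements as ratios (run-time performance and energy efficiency) without stating the formulas. As a rough illustration of how such ratios are commonly computed, here is a minimal Python sketch. It assumes that run-time improvement is the ratio of per-frame latencies and that energy efficiency is measured in frames per joule (FPS divided by average power); the paper may define these differently. The function names and all numbers are hypothetical placeholders, not measurements from the paper.

```python
def speedup(baseline_latency_ms: float, accelerated_latency_ms: float) -> float:
    """Run-time improvement as a ratio of per-frame latencies (assumed definition)."""
    return baseline_latency_ms / accelerated_latency_ms


def frames_per_joule(fps: float, avg_power_w: float) -> float:
    """Energy efficiency as frames processed per joule (assumed definition).

    FPS is frames per second and watts are joules per second,
    so FPS / W gives frames per joule.
    """
    return fps / avg_power_w


def efficiency_gain(fps_a: float, power_a_w: float,
                    fps_b: float, power_b_w: float) -> float:
    """How many times more energy-efficient platform A is than platform B."""
    return frames_per_joule(fps_a, power_a_w) / frames_per_joule(fps_b, power_b_w)


if __name__ == "__main__":
    # Placeholder values chosen only to exercise the functions;
    # they are NOT results from the paper.
    print(speedup(baseline_latency_ms=20.0, accelerated_latency_ms=10.0))
    print(efficiency_gain(fps_a=100.0, power_a_w=20.0,
                          fps_b=50.0, power_b_w=25.0))
```

Under these definitions a platform can be faster yet less energy-efficient (or the reverse) depending on its power draw, which is why the abstract reports the two ratios separately.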
