Real-time Monocular Depth Estimation on Embedded Systems
This work addresses the need for real-time depth sensing in unmanned aerial and autonomous vehicles; its contribution is an incremental improvement in inference efficiency on embedded platforms.
The paper tackles slow monocular depth estimation on embedded systems by proposing two lightweight architectures, RT-MonoDepth and RT-MonoDepth-S, which reach frame rates of up to 364.1 FPS on Jetson AGX Orin while maintaining accuracy comparable to prior methods.
Depth sensing is critical for unmanned aerial and autonomous vehicles. However, current monocular depth estimation methods rely on complex deep convolutional neural networks that are too slow for real-time inference on embedded platforms. This paper addresses this problem by proposing two efficient and lightweight architectures, RT-MonoDepth and RT-MonoDepth-S, which reduce computational complexity and latency. Our methods achieve accuracy comparable to prior depth estimation methods while delivering faster inference. Specifically, RT-MonoDepth and RT-MonoDepth-S reach 18.4 and 30.5 FPS, respectively, on NVIDIA Jetson Nano, and 253.0 and 364.1 FPS on Jetson AGX Orin, given a single RGB image of resolution 640×192. Experiments on the KITTI dataset show that our methods are both more accurate and faster than existing fast monocular depth estimation methods.
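To put the reported frame rates in terms of a per-frame time budget, the short sketch below converts FPS to latency using latency (ms) = 1000 / FPS and computes the throughput ratio of RT-MonoDepth-S to RT-MonoDepth on each device. Only the four FPS figures come from the abstract; the names `REPORTED_FPS`, `fps_to_latency_ms`, and `speedup` are hypothetical helpers introduced for illustration, and the conversion assumes frames are processed one at a time with no batching or pipelining.

```python
# Frame rates (FPS) quoted in the abstract for a single 640x192 RGB input.
# The dictionary layout and helper names are illustrative assumptions.
REPORTED_FPS = {
    ("Jetson Nano", "RT-MonoDepth"): 18.4,
    ("Jetson Nano", "RT-MonoDepth-S"): 30.5,
    ("Jetson AGX Orin", "RT-MonoDepth"): 253.0,
    ("Jetson AGX Orin", "RT-MonoDepth-S"): 364.1,
}


def fps_to_latency_ms(fps: float) -> float:
    """Convert frames per second to per-frame latency in milliseconds.

    Assumes sequential, unbatched inference, so latency = 1000 / FPS.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def speedup(fast_fps: float, slow_fps: float) -> float:
    """Throughput ratio of the faster model to the slower one."""
    if slow_fps <= 0:
        raise ValueError("slow_fps must be positive")
    return fast_fps / slow_fps


if __name__ == "__main__":
    # Per-frame latency for each device/model pair.
    for (device, model), fps in REPORTED_FPS.items():
        latency = fps_to_latency_ms(fps)
        print(f"{device:16s} {model:15s} {fps:6.1f} FPS  {latency:6.2f} ms/frame")

    # Throughput gain of the smaller variant on each device.
    for device in ("Jetson Nano", "Jetson AGX Orin"):
        ratio = speedup(
            REPORTED_FPS[(device, "RT-MonoDepth-S")],
            REPORTED_FPS[(device, "RT-MonoDepth")],
        )
        print(f"{device}: RT-MonoDepth-S is {ratio:.2f}x faster than RT-MonoDepth")
```

Under these assumptions, the reported rates correspond to roughly 54 ms and 33 ms per frame on Jetson Nano, and roughly 4.0 ms and 2.7 ms per frame on Jetson AGX Orin.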