Visual Depth Mapping from Monocular Images using Recurrent Convolutional Neural Networks
This work addresses the need for low-cost, lightweight collision avoidance sensors for small unmanned vehicles, though the contribution is incremental, as it builds on existing deep learning approaches.
The paper aims to enable safe autonomous operation of unmanned aircraft by estimating object distances from visual image sequences with a deep recurrent convolutional neural network trained on simulated data from AirSim. The results show superior performance compared to prior methods and demonstrate that the approach can be applied to sense-and-avoid of obstacles in simulation.
A reliable sense-and-avoid system is critical to enabling safe autonomous operation of unmanned aircraft. Existing sense-and-avoid methods often require specialized sensors that are too large or power-intensive for use on small unmanned vehicles. This paper presents a method to estimate object distances based on visual image sequences, allowing for the use of low-cost, on-board monocular cameras as simple collision avoidance sensors. We present a deep recurrent convolutional neural network and training method to generate depth maps from video sequences. Our network is trained using simulated camera and depth data generated with Microsoft's AirSim simulator. Empirically, we show that our model achieves superior performance compared to models generated using prior methods. We further demonstrate that the method can be used for sense-and-avoid of obstacles in simulation.
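To illustrate how a predicted depth map could feed a collision avoidance decision, the following is a minimal sketch and not the method used in the paper. It assumes the depth map is a 2D grid of distances in meters. The function names, the central-window heuristic, and the 5 m safety threshold are all hypothetical choices made for illustration.

```python
def nearest_obstacle(depth_map, window=0.5):
    """Return the smallest depth (assumed meters) inside the central
    `window` fraction of the frame. The central-window heuristic is an
    assumption for illustration, not taken from the paper."""
    rows = len(depth_map)
    cols = len(depth_map[0])
    # Margins to skip on each side so only the central region is inspected.
    r_margin = int(rows * (1 - window) / 2)
    c_margin = int(cols * (1 - window) / 2)
    central = [
        depth_map[r][c]
        for r in range(r_margin, rows - r_margin)
        for c in range(c_margin, cols - c_margin)
    ]
    return min(central)


def should_avoid(depth_map, safe_distance_m=5.0, window=0.5):
    """Flag an avoidance maneuver when anything in the central region is
    closer than the (hypothetical) safety threshold."""
    return nearest_obstacle(depth_map, window) < safe_distance_m
```

In this sketch, only obstacles near the image center (roughly along the direction of travel for a forward-facing camera) trigger avoidance; a real system would also need to account for camera geometry, vehicle dynamics, and depth estimation error.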