CV, RO · Aug 21, 2020

Towards Autonomous Driving: a Multi-Modal 360$^{\circ}$ Perception Proposal

Jorge Beltrán, Carlos Guindel, Irene Cortés, Alejandro Barrera, Armando Astudillo, Jesús Urdiales, Mario Álvarez, Farid Bekka, Vicente Milanés, Fernando García

arXiv:2008.09672v1 · 5.0 · 16 citations

Originality: Incremental advance

AI Analysis

This work addresses perception for autonomous driving systems, but the advance appears incremental: it combines existing methods (CNNs, PointNet, and Kalman filtering) in a new sensor fusion configuration rather than introducing new algorithms.

The paper tackles 3D object detection and tracking for autonomous vehicles with a multi-modal 360-degree framework. It chains CNN-based instance segmentation, LiDAR-to-image association, a PointNet ensemble for 3D bounding boxes, and an Unscented Kalman Filter for tracking. The authors report accurate and reliable detection of the road environment in tests on a real autonomous vehicle.

In this paper, a multi-modal 360$^{\circ}$ framework for 3D object detection and tracking for autonomous vehicles is presented. The process is divided into four main stages. First, images are fed into a CNN to obtain instance segmentation of the surrounding road participants. Second, LiDAR-to-image association is performed for the estimated mask proposals. Third, the isolated points of every object are processed by a PointNet ensemble to compute their corresponding 3D bounding boxes and poses. Lastly, a tracking stage based on an Unscented Kalman Filter is used to track the agents over time. The solution, based on a novel sensor fusion configuration, provides accurate and reliable road environment detection. A wide variety of tests of the system, deployed in an autonomous vehicle, have successfully assessed the suitability of the proposed perception stack in a real autonomous driving application.
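
The second stage, LiDAR-to-image association, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the LiDAR points have already been transformed into the camera frame, uses a plain pinhole model, and represents an instance mask as a nested list of booleans. The function names (`project_point`, `points_in_mask`) and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]


def project_point(
    p: Point3D, fx: float, fy: float, cx: float, cy: float
) -> Optional[Tuple[float, float]]:
    """Project a camera-frame 3D point to pixel coordinates (pinhole model).

    Assumes z points forward. Returns None for points at or behind the camera.
    """
    x, y, z = p
    if z <= 0.0:
        return None
    return fx * x / z + cx, fy * y / z + cy


def points_in_mask(
    points: Sequence[Point3D],
    mask: Sequence[Sequence[bool]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Keep the points whose projection lands on a True pixel of the mask."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    selected: List[Point3D] = []
    for p in points:
        uv = project_point(p, fx, fy, cx, cy)
        if uv is None:
            continue
        # Floor (not truncate) so slightly negative coordinates stay out of bounds.
        col, row = math.floor(uv[0]), math.floor(uv[1])
        if 0 <= row < height and 0 <= col < width and mask[row][col]:
            selected.append(p)
    return selected
```

The points returned for each mask would then be the "isolated points" handed to the next stage. A real system would also need the LiDAR-to-camera extrinsic calibration and lens distortion handling, both of which this sketch leaves out.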
