CV | RO | Mar 6, 2021

A Simple and Efficient Multi-task Network for 3D Object Detection and Road Understanding

Di Feng, Yiyang Zhou, Chenfeng Xu, Masayoshi Tomizuka, Wei Zhan

arXiv:2103.04056v1 | 11.632 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for integrated perception in autonomous driving, though it is incremental as it combines existing techniques into a multi-task framework.

The authors address multiple perception tasks for autonomous driving, including 3D object detection and road understanding, with a single multi-task network that achieves accuracy competitive with state-of-the-art methods.

Detecting dynamic objects and predicting static road information such as drivable areas and ground heights are crucial for safe autonomous driving. Previous works studied each perception task separately and lacked a collective quantitative analysis. In this work, we show that it is possible to perform all perception tasks via a simple and efficient multi-task network. Our proposed network, LidarMTL, takes a raw LiDAR point cloud as input and predicts six perception outputs for 3D object detection and road understanding. The network is based on an encoder-decoder architecture with 3D sparse convolution and deconvolution operations. Extensive experiments show that the proposed method achieves competitive accuracies compared to state-of-the-art object detectors and other task-specific networks. LidarMTL is also leveraged for online localization. Code and a pre-trained model are available at https://github.com/frankfengdi/LidarMTL.
