CV · Aug 24, 2020

TORNADO-Net: mulTiview tOtal vaRiatioN semAntic segmentation with Diamond inceptiOn module

arXiv:2008.10544v1 · 83 citations
Originality: Incremental advance
AI Analysis

This work addresses scene understanding for robotics and autonomous driving, representing an incremental improvement over existing projection-based methods.

The paper tackles 3D LiDAR point cloud semantic segmentation by introducing TORNADO-Net, which uses multi-view projections, a diamond context block, and a loss combination including Total Variation, achieving state-of-the-art results on the SemanticKITTI dataset.

Semantic segmentation of point clouds is a key component of scene understanding for robotics and autonomous driving. In this paper, we introduce TORNADO-Net, a neural network for 3D LiDAR point cloud semantic segmentation. We combine multi-view (bird's-eye and range) projection feature extraction with an encoder-decoder ResNet architecture that includes a novel diamond context block. Current projection-based methods do not take into account that neighboring points usually belong to the same class. To better utilize this local neighborhood information and reduce noisy predictions, we introduce a combination of Total Variation, Lovász-Softmax, and Weighted Cross-Entropy losses. Because LiDAR data covers a 360-degree field of view, we also use circular padding. We demonstrate state-of-the-art results on the SemanticKITTI dataset and provide thorough quantitative evaluations and ablation results.
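Two ideas in the abstract can be illustrated with a minimal NumPy sketch: wrap-around (circular) padding along the azimuth axis of a range image, and a total-variation style penalty that grows when neighboring pixels disagree. This is not the paper's implementation. The function names `circular_pad_width` and `total_variation`, the `(H, W, C)` array layout, and the generic anisotropic form of the penalty are all assumptions made for illustration; the paper's Total Variation loss may be defined differently.

```python
import numpy as np


def circular_pad_width(range_image: np.ndarray, pad: int) -> np.ndarray:
    """Pad an (H, W, C) range image along the width (azimuth) axis by wrapping.

    Hypothetical helper: mode="wrap" copies columns from the opposite edge,
    so the first and last columns are treated as neighbors, matching a
    360-degree scan.
    """
    return np.pad(range_image, ((0, 0), (pad, pad), (0, 0)), mode="wrap")


def total_variation(probs: np.ndarray) -> float:
    """Generic anisotropic total variation of an (H, W, C) probability map.

    Assumed form (not taken from the paper): mean absolute difference between
    vertically adjacent pixels plus mean absolute difference between
    horizontally adjacent pixels, with the horizontal direction wrapping around.
    """
    vertical = np.abs(probs[1:, :, :] - probs[:-1, :, :])
    horizontal = np.abs(np.roll(probs, -1, axis=1) - probs)
    return float(vertical.mean() + horizontal.mean())


# A perfectly uniform prediction versus one with a single flipped pixel.
smooth = np.zeros((4, 8, 2))
smooth[..., 0] = 1.0
noisy = smooth.copy()
noisy[2, 3] = [0.0, 1.0]

print(total_variation(smooth))                      # zero for a uniform map
print(total_variation(noisy) > total_variation(smooth))
print(circular_pad_width(smooth, 2).shape)          # width grows by 2 * pad
```

An isolated misclassified pixel raises the penalty, which is the intuition behind using a smoothness term to suppress noisy predictions. In a training setup, such a term would be weighted and added to the other losses.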

