RO CVJun 26, 2017

Deep Semantic Classification for 3D LiDAR Data

Ayush Dewan, Gabriel L. Oliveira, Wolfram Burgard

arXiv:1706.08355v118.258 citations

Originality Incremental advance

AI Analysis

This addresses the need for robots to understand dynamic environments, but it appears incremental as it builds on existing methods with a hybrid approach.

The paper tackles the problem of classifying 3D LiDAR data into non-movable, movable, and dynamic classes to enable autonomous robot operation in dynamic environments, reporting competitive results on a standard benchmark dataset and showing improvement by combining semantic and motion cues.

Robots are expected to operate autonomously in dynamic environments. Understanding the underlying dynamic characteristics of objects is a key enabler for achieving this goal. In this paper, we propose a method for pointwise semantic classification of 3D LiDAR data into three classes: non-movable, movable and dynamic. We concentrate on understanding these specific semantics because they characterize important information required for an autonomous system. Non-movable points in the scene belong to unchanging segments of the environment, whereas the remaining classes corresponds to the changing parts of the scene. The difference between the movable and dynamic class is their motion state. The dynamic points can be perceived as moving, whereas movable objects can move, but are perceived as static. To learn the distinction between movable and non-movable points in the environment, we introduce an approach based on deep neural network and for detecting the dynamic points, we estimate pointwise motion. We propose a Bayes filter framework for combining the learned semantic cues with the motion cues to infer the required semantic classification. In extensive experiments, we compare our approach with other methods on a standard benchmark dataset and report competitive results in comparison to the existing state-of-the-art. Furthermore, we show an improvement in the classification of points by combining the semantic cues retrieved from the neural network with the motion cues.

View on arXiv PDF

Similar