CV | RO | Mar 5, 2021

Fine-Grained Off-Road Semantic Segmentation and Mapping via Contrastive Learning

arXiv:2103.03651v1 | 37 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for fine-grained scene understanding to support off-road robot navigation in diverse and complex environments, representing an incremental improvement over binary classification approaches.

The paper tackles the problem of fine-grained semantic segmentation for off-road robots by proposing a contrastive learning method to discriminate regions with different traversability, achieving an anchor accuracy of 89.8% in cross-scene validation and producing maps with fine-grained labels and confidence values.

Road detection, or traversability analysis, has been a key technique for mobile robots traversing complex off-road scenes. Early works mainly formulated the problem as binary classification, e.g. associating pixels with road or non-road labels. However, off-road robots need to understand scenes with fine-grained labels, because scenes are very diverse and the differing mechanical performance of off-road robots may lead to different definitions of safe regions to traverse. How to define and annotate fine-grained labels to achieve meaningful scene understanding for a robot traversing off-road terrain is still an open question. This research proposes a contrastive-learning-based method. With a set of human-annotated anchor patches, a feature representation is learned to discriminate regions with different traversability, and a method of fine-grained semantic segmentation and mapping is subsequently developed for off-road scene understanding. Experiments are conducted on a dataset of three driving segments that represent very diverse off-road scenes. An anchor accuracy of 89.8% is achieved by evaluating the matching with human-annotated image patches in cross-scene validation. Examined against the associated 3D LiDAR data, the fine-grained segments of the visual images are shown to have different levels of toughness and terrain elevation, which indicates that they are semantically meaningful. The resultant maps contain both fine-grained labels and confidence values, providing rich information to support a robot traversing complex off-road scenes.
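The abstract does not spell out the loss function or how labels and confidence values are derived, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes patch embeddings are already available as NumPy arrays, uses a generic supervised contrastive loss (same-label anchor patches are pulled together, different-label ones pushed apart), and assigns each new patch the label of its most similar anchor class, with a softmax over class similarities as a stand-in confidence. The function names, the `temperature` parameter, and the toy 2-D features are all hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (assumed form, not taken from the paper).

    Expects at least two samples. Anchors with no same-label partner are skipped.
    """
    z = l2_normalize(np.asarray(features, dtype=float))
    labels = np.asarray(labels)
    n = z.shape[0]
    sim = z @ z.T / temperature
    not_self = ~np.eye(n, dtype=bool)
    # Log-softmax of each row over all *other* samples (self-similarity excluded).
    sim_masked = np.where(not_self, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(sim_masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom
    # Positives are other samples that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())

def label_with_confidence(patch_features, anchor_features, anchor_labels, temperature=0.1):
    """Label each patch by its most similar anchor class; confidence is a softmax score."""
    z = l2_normalize(np.asarray(patch_features, dtype=float))
    a = l2_normalize(np.asarray(anchor_features, dtype=float))
    anchor_labels = np.asarray(anchor_labels)
    classes = np.unique(anchor_labels)
    sim = z @ a.T  # shape: (num_patches, num_anchors)
    # For each class, keep the similarity to its closest anchor patch.
    class_sim = np.stack([sim[:, anchor_labels == c].max(axis=1) for c in classes], axis=1)
    logits = class_sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    best = probs.argmax(axis=1)
    return classes[best], probs[np.arange(len(best)), best]

# Toy usage: four hypothetical anchor embeddings in two classes, two query patches.
anchors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
anchor_classes = [0, 0, 1, 1]
loss = supervised_contrastive_loss(anchors, anchor_classes)
pred_labels, pred_conf = label_with_confidence(
    np.array([[1.0, 0.05], [0.0, 1.0]]), anchors, anchor_classes
)
print(loss, pred_labels, pred_conf)
```

In a real pipeline the embeddings would come from a trained network and the per-patch labels and confidences would be projected into a map. Both steps are outside this sketch.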
