CV · IV · Jun 9, 2019

Cross-view Semantic Segmentation for Sensing Surroundings

arXiv:1906.03560v3 · 319 citations
Originality: Incremental advance
AI Analysis

This work addresses robot perception for spatial understanding, but the advance is incremental: it builds on existing domain adaptation and segmentation techniques.

The paper aims to let robots perceive their surroundings by introducing a cross-view semantic segmentation task, which parses first-view observations into top-down-view semantic maps. It presents a View Parsing Network (VPN) that is trained in a 3D graphics environment and transferred to real-world data through domain adaptation. The authors report that the model is effective on synthetic and real-world agents and that it enables surrounding sensing from 2D images on a LoCoBot robot.

Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from observations. To equip robot perception with such a surrounding-sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation, together with a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse first-view observations into a top-down-view semantic map indicating the spatial location of all objects at the pixel level. The main difficulty of this task is the lack of real-world annotations for top-down-view data. To mitigate this, we train the VPN in a 3D graphics environment and use domain adaptation to transfer it to real-world data. We evaluate our VPN on both synthetic and real-world agents. The experimental results show that our model can effectively use information from different views and multiple modalities to understand spatial information. A further experiment on a LoCoBot robot shows that our model enables surrounding sensing from 2D image input. Code and demo videos can be found at https://view-parsing-network.github.io.
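
To make the output of the task concrete, the sketch below builds a small top-down-view semantic map from a single first-view observation. It is not the View Parsing Network: VPN is a learned model, whereas this is a hand-written geometric projection that assumes per-pixel depth and pinhole-camera intrinsics are available. The function name `project_to_top_down`, the `FREE` label, and all parameters are hypothetical and chosen only for illustration.

```python
import math

FREE = 0  # hypothetical label id for free space or unobserved cells


def project_to_top_down(labels, depth, fx, cx, grid_size, cell_size):
    """Project first-view class labels onto a square top-down grid.

    labels:    H x W nested list of class ids seen in the first view
    depth:     H x W nested list of forward distances in metres
    fx, cx:    assumed pinhole focal length and principal point (pixels)
    grid_size: number of cells along each side of the top-down map
    cell_size: side length of one cell in metres

    The agent sits at the bottom-centre of the grid and faces "up".
    Later pixels overwrite earlier ones that land in the same cell.
    """
    grid = [[FREE] * grid_size for _ in range(grid_size)]
    centre = grid_size // 2
    for row_labels, row_depth in zip(labels, depth):
        for u, (cls, z) in enumerate(zip(row_labels, row_depth)):
            if z <= 0:
                continue  # no valid depth for this pixel
            x = (u - cx) * z / fx  # lateral offset in metres
            col = centre + math.floor(x / cell_size + 0.5)
            row = grid_size - 1 - math.floor(z / cell_size)
            if 0 <= row < grid_size and 0 <= col < grid_size:
                grid[row][col] = cls
    return grid


# Three pixels, each 2 m ahead, labelled with three different classes.
demo_map = project_to_top_down(
    labels=[[1, 2, 3]],
    depth=[[2.0, 2.0, 2.0]],
    fx=1.0,
    cx=1.0,
    grid_size=5,
    cell_size=1.0,
)
for line in demo_map:
    print(line)
```

Each cell of the returned grid holds one class id, which is the pixel-level top-down layout the task asks for. According to the abstract, the learned model produces such a map from 2D image input, so it does not depend on the depth input this sketch requires.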

Code Implementations: 1 repo
