CV · Dec 27, 2018

S4-Net: Geometry-Consistent Semi-Supervised Semantic Segmentation

Sinisa Stekovic, Friedrich Fraundorfer, Vincent Lepetit

arXiv:1812.10717v2 · 3.94 citations

Originality: Incremental advance

AI Analysis

This paper addresses the cost of annotating data for semantic segmentation in rigid scenes. The advance appears incremental, since it applies existing semi-supervised techniques to a specific context.

The paper tackles semantic segmentation with limited manual annotations by enforcing geometric 3D consistency between multiple views, and shows that one manually labeled image per scene is sufficient for high performance on the LabelFusion dataset.

We show that it is possible to learn semantic segmentation from very limited amounts of manual annotations, by enforcing geometric 3D constraints between multiple views. More exactly, image locations corresponding to the same physical 3D point should all have the same label. We show that introducing such constraints during learning is very effective, even when no manual label is available for a 3D point, and can be done simply by applying techniques from 'general' semi-supervised learning to the context of semantic segmentation. To demonstrate this idea, we use RGB-D image sequences of rigid scenes, for a 4-class segmentation problem derived from the ScanNet dataset. Starting from RGB-D sequences with a few annotated frames, we show that we can incorporate RGB-D sequences without any manual annotations to improve the performance, which makes our approach very convenient. Furthermore, we demonstrate our approach for semantic segmentation of objects on the LabelFusion dataset, where we show that one manually labeled image in a scene is sufficient for high performance on the whole scene.
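The constraint that image locations showing the same 3D point should receive the same label can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names `warp_pixels` and `consistency_loss`, the shared pinhole intrinsics `K`, the known relative pose `T_ab`, and the squared-difference penalty are all assumptions made for illustration. Occlusions are ignored.

```python
import numpy as np

def warp_pixels(depth_a, K, T_ab):
    """Project every pixel of view A into view B using A's depth map.

    depth_a: (H, W) depth of view A (0 marks missing depth).
    K:       (3, 3) pinhole intrinsics, assumed shared by both views.
    T_ab:    (4, 4) rigid transform from camera-A to camera-B coordinates.
    Returns integer pixel coordinates (u_b, v_b) in B and a validity mask.
    """
    h, w = depth_a.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Back-project pixels to 3D points in camera A.
    pts_a = (np.linalg.inv(K) @ pix.T) * depth_a.reshape(1, -1)
    # Express the same 3D points in camera B.
    pts_b = T_ab[:3, :3] @ pts_a + T_ab[:3, 3:4]
    z = pts_b[2]
    safe_z = np.where(z > 1e-6, z, 1.0)  # avoid division by zero
    proj = K @ pts_b
    u_b = np.rint(proj[0] / safe_z).astype(int)
    v_b = np.rint(proj[1] / safe_z).astype(int)
    valid = (
        (depth_a.reshape(-1) > 0) & (z > 1e-6)
        & (u_b >= 0) & (u_b < w) & (v_b >= 0) & (v_b < h)
    )
    return u_b.reshape(h, w), v_b.reshape(h, w), valid.reshape(h, w)

def consistency_loss(probs_a, probs_b, depth_a, K, T_ab):
    """Mean squared difference between class probabilities of matched pixels.

    probs_a, probs_b: (H, W, C) per-pixel class probabilities for views A and B.
    No manual labels are needed: the loss only compares the two predictions.
    """
    u_b, v_b, valid = warp_pixels(depth_a, K, T_ab)
    if not valid.any():
        return 0.0
    pa = probs_a[valid]                    # (N, C) predictions in view A
    pb = probs_b[v_b[valid], u_b[valid]]   # (N, C) predictions at matched pixels in B
    return float(np.mean(np.sum((pa - pb) ** 2, axis=-1)))
```

In a training setup, a term like this would be added to the usual supervised loss on the few annotated frames, so that unlabeled frames of the same rigid scene still contribute a training signal.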
