CV · Nov 28, 2018

Future Segmentation Using 3D Structure

Suhani Vora, Reza Mahjourian, Soeren Pirk, Anelia Angelova

arXiv:1811.11358v15.29 citations · h-index: 33

Originality: Incremental advance

AI Analysis

This work addresses a critical need for autonomous agents that rely on real-time visual data for decision-making, representing an incremental improvement in the field.

The paper tackles the problem of predicting future frame segmentation from monocular video by leveraging 3D scene structure, achieving state-of-the-art accuracy in future semantic segmentation.

Predicting the future to anticipate the outcome of events and actions is a critical attribute of autonomous agents, particularly for agents that must rely heavily on real-time visual data for decision making. Working towards this capability, we address the task of predicting future frame segmentation from a stream of monocular video by leveraging the 3D structure of the scene. Our framework is based on learnable sub-modules capable of predicting pixel-wise scene semantic labels, depth, and camera ego-motion of adjacent frames. We further propose a recurrent neural network-based model capable of predicting the future ego-motion trajectory as a function of a series of past ego-motion steps. Ultimately, we observe that leveraging 3D structure in the model facilitates successful prediction, achieving state-of-the-art accuracy in future semantic segmentation.
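The abstract does not spell out how depth and ego-motion are combined with the segmentation, so the following is only a minimal sketch of one plausible way to use them: forward-warping a label map into a future view. It assumes a pinhole camera, a static scene, and an already-predicted future ego-motion transform. The function name `warp_segmentation`, the `ignore_label` value, and the z-buffer step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def warp_segmentation(seg, depth, K, T, ignore_label=255):
    """Forward-warp a label map into a future camera view (illustrative sketch).

    seg:   (H, W) integer class labels for the current frame
    depth: (H, W) positive depth per pixel for the same frame
    K:     (3, 3) pinhole intrinsics (assumed known)
    T:     (4, 4) rigid transform mapping current-camera points
           into the future camera frame (the predicted ego-motion)
    """
    H, W = seg.shape
    n = H * W

    # Homogeneous pixel grid, shape (3, H*W), in row-major pixel order.
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(n)]).astype(np.float64)

    # Back-project pixels to 3D points in the current camera frame.
    pts = (np.linalg.inv(K) @ pix) * depth.ravel()

    # Apply the ego-motion and project into the future image plane.
    pts_future = (T @ np.vstack([pts, np.ones(n)]))[:3]
    proj = K @ pts_future
    z = proj[2]
    valid = z > 1e-6  # keep only points in front of the future camera

    u2 = np.zeros(n, dtype=np.int64)
    v2 = np.zeros(n, dtype=np.int64)
    u2[valid] = np.rint(proj[0, valid] / z[valid]).astype(np.int64)
    v2[valid] = np.rint(proj[1, valid] / z[valid]).astype(np.int64)
    valid &= (u2 >= 0) & (u2 < W) & (v2 >= 0) & (v2 < H)

    # Z-buffer: when several source pixels land on one target pixel,
    # keep the label of the nearest one.
    idx = np.flatnonzero(valid)
    target = v2[idx] * W + u2[idx]
    zbuf = np.full(n, np.inf)
    np.minimum.at(zbuf, target, z[idx])
    nearest = z[idx] == zbuf[target]

    # Pixels that receive no projected point keep the ignore label.
    out = np.full(n, ignore_label, dtype=seg.dtype)
    out[target[nearest]] = seg.ravel()[idx[nearest]]
    return out.reshape(H, W)
```

With an identity transform the function returns the input labels unchanged. Pixels that come into view only in the future frame are left as `ignore_label`, since this sketch has no way to infer labels for unseen regions.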

