CV · Jul 1, 2015

Beyond Semantic Image Segmentation: Exploring Efficient Inference in Video

arXiv:1507.01578v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses video semantic segmentation as an incremental extension of existing image-based methods.

The paper tackles the problem of extending semantic segmentation from images to video by adapting CRF inference methods to handle video data efficiently, achieving inference over ten thousand images within seconds.

We explore the efficiency of the CRF inference module beyond image-level semantic segmentation. The key idea is to combine the best of two worlds: semantic co-labeling and more expressive models. Similar to [Alvarez14], our formulation enables us to perform inference over ten thousand images within seconds. At the same time, it can handle higher-order clique potentials similar to [vineet2014], capturing region-level label consistency and context in the form of co-occurrences. We follow the mean-field updates for higher-order potentials of [vineet2014] and, inspired by [Alvarez14], extend the spatial smoothness and appearance kernels of [DenseCRF13] to video data, making the system well suited to video semantic segmentation.
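
To make the kernel extension concrete, here is a minimal sketch of mean-field inference with dense-CRF-style smoothness and appearance kernels whose position feature carries a frame index t next to (x, y). Everything in it is an assumption made for illustration rather than the paper's formulation: the function names `video_kernel` and `mean_field`, the bandwidth values, the plain Potts compatibility, and treating time as one more position coordinate. The higher-order and co-occurrence potentials are left out, and the kernel is built as a dense N x N matrix, which is only workable for toy inputs; efficient dense-CRF inference relies on fast high-dimensional filtering instead.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def video_kernel(pos, rgb, w_app=1.0, w_smooth=1.0,
                 theta_pos=3.0, theta_rgb=10.0, theta_smooth=1.0):
    """Dense pairwise kernel over N pixels.

    pos: (N, 3) array of (x, y, t); the t column is the assumed video extension.
    rgb: (N, 3) array of colour values.
    All weights and bandwidths are hypothetical illustration values.
    """
    d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
    d_rgb = ((rgb[:, None, :] - rgb[None, :, :]) ** 2).sum(-1)
    # Appearance kernel: nearby pixels with similar colour support each other.
    appearance = w_app * np.exp(-d_pos / (2 * theta_pos ** 2)
                                - d_rgb / (2 * theta_rgb ** 2))
    # Smoothness kernel: nearby pixels support each other regardless of colour.
    smoothness = w_smooth * np.exp(-d_pos / (2 * theta_smooth ** 2))
    k = appearance + smoothness
    np.fill_diagonal(k, 0.0)  # a pixel sends no message to itself
    return k


def mean_field(unary, kernel, n_iters=5):
    """Mean-field updates for unary + Potts pairwise terms.

    unary: (N, L) negative log-probabilities per pixel and label.
    With a Potts compatibility, the pairwise penalty for label l equals a
    per-pixel constant minus the kernel-weighted belief in l, so the update
    reduces to adding that belief to the negated unary.
    """
    q = softmax(-unary)
    for _ in range(n_iters):
        message = kernel @ q          # (N, L) kernel-weighted neighbour beliefs
        q = softmax(-unary + message)
    return q


# Toy clip: 2 frames of 2x2 pixels, uniform colour, two labels.
pos = np.array([(x, y, t) for t in range(2) for y in range(2) for x in range(2)],
               dtype=float)
rgb = np.zeros((8, 3))
unary = np.tile([0.1, 3.0], (8, 1))   # most pixels prefer label 0
unary[0] = [1.0, 0.5]                 # one noisy pixel weakly prefers label 1
kernel = video_kernel(pos, rgb)
q = mean_field(unary, kernel)
labels = q.argmax(axis=1)
```

In this toy clip the noisy pixel receives messages from neighbours in its own frame and in the adjacent frame, which is the co-labeling effect the temporal coordinate is meant to provide.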
