On guiding video object segmentation
This work addresses video object segmentation for computer vision applications, presenting an incremental improvement over existing methods.
The paper tackles segmenting moving objects in unconstrained environments by using guided convolutional neural networks with foreground masks from independent algorithms, achieving results that outperform non-guided and top-performing methods on the DAVIS 2016 dataset.
This paper presents a novel approach for segmenting moving objects in unconstrained environments using guided convolutional neural networks. The guiding process relies on foreground masks produced by independent, state-of-the-art algorithms to implement an attention mechanism that uses the spatial location of foreground and background to compute separate representations for each. Our approach first extracts two kinds of features for each frame, one from colour and one from optical flow information. These features are combined following a multiplicative scheme to benefit from their complementarity. The unified colour and motion features are then processed to obtain the separate foreground and background representations. Finally, both representations are concatenated and decoded to perform foreground segmentation. Experiments conducted on the challenging DAVIS 2016 dataset demonstrate that our guided representations outperform not only their non-guided counterparts but also recent, top-performing video object segmentation algorithms.
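The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the (channels, height, width) feature layout, and in particular the use of plain mask multiplication as a stand-in for the attention mechanism are all assumptions made for illustration. The feature extractors and the decoder are omitted.

```python
import numpy as np


def fuse_multiplicative(colour_feat, motion_feat):
    """Combine colour and optical-flow features by element-wise product.

    Both inputs are assumed to be arrays of shape (C, H, W).
    """
    return colour_feat * motion_feat


def guided_split(fused, fg_mask):
    """Split fused features into foreground and background parts.

    fg_mask is an (H, W) array in [0, 1] from an independent algorithm.
    Simple masking is an assumption standing in for the paper's
    attention mechanism, whose exact form the abstract does not give.
    """
    fg = fused * fg_mask[None, :, :]
    bg = fused * (1.0 - fg_mask[None, :, :])
    return fg, bg


def guided_representation(colour_feat, motion_feat, fg_mask):
    """Fuse, split by the guiding mask, and concatenate along channels."""
    fused = fuse_multiplicative(colour_feat, motion_feat)
    fg, bg = guided_split(fused, fg_mask)
    # The concatenated (2C, H, W) tensor would be passed to a decoder.
    return np.concatenate([fg, bg], axis=0)


# Toy example: 2-channel features on a 2x2 frame.
colour = np.ones((2, 2, 2))
motion = np.full((2, 2, 2), 3.0)
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
rep = guided_representation(colour, motion, mask)
```

In this toy setting the foreground and background halves of `rep` partition the fused features, so a decoder receiving the result sees each region in its own set of channels.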