SegFlow: Joint Learning for Video Object Segmentation and Optical Flow
This work addresses the integration of object segmentation and optical flow estimation in videos, which matters for applications such as video analysis and robotics. The contribution is incremental, however, because it builds on existing models: FlowNet and fully convolutional networks.
The paper tackles video object segmentation and optical flow prediction jointly by proposing SegFlow, an end-to-end network in which information propagates bidirectionally between the two branches. This improves performance on both tasks relative to state-of-the-art methods.
This paper proposes an end-to-end trainable network, SegFlow, for simultaneously predicting pixel-wise object segmentation and optical flow in videos. SegFlow has two branches, and useful information about object segmentation and optical flow is propagated bidirectionally between them in a unified framework. The segmentation branch is based on a fully convolutional network, which has proven effective for image segmentation, and the optical flow branch builds on the FlowNet model. The unified framework is trained iteratively offline to learn a generic notion, and is fine-tuned online for specific objects. Extensive experiments on both video object segmentation and optical flow datasets demonstrate that introducing optical flow improves segmentation performance and vice versa, in comparisons against state-of-the-art algorithms.
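The abstract does not spell out how features cross between the two branches or what "trained iteratively" means in practice. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes that the exchange is a plain concatenation of feature vectors and that iterative training alternates between updating one branch while the other is held fixed. The names `exchange` and `training_schedule` are hypothetical, and short lists of floats stand in for feature maps.

```python
from typing import List, Tuple

# Stand-in for a feature map flattened to a vector (assumption for illustration).
Features = List[float]


def exchange(seg_feat: Features, flow_feat: Features) -> Tuple[Features, Features]:
    """Bidirectional propagation, assumed here to be plain concatenation.

    Each branch keeps its own features first and appends the other branch's.
    """
    seg_in = seg_feat + flow_feat   # segmentation branch also sees flow cues
    flow_in = flow_feat + seg_feat  # flow branch also sees segmentation cues
    return seg_in, flow_in


def training_schedule(rounds: int) -> List[Tuple[str, str]]:
    """Hypothetical alternating schedule.

    Each entry is (branch being updated, branch held fixed).
    """
    plan: List[Tuple[str, str]] = []
    for _ in range(rounds):
        plan.append(("segmentation", "flow"))
        plan.append(("flow", "segmentation"))
    return plan


if __name__ == "__main__":
    seg_in, flow_in = exchange([0.1, 0.9], [1.5, -0.5, 0.0])
    print(len(seg_in), len(flow_in))
    for trainable, frozen in training_schedule(2):
        print(f"update {trainable}, hold {frozen} fixed")
```

In a real network the concatenation would act on multi-channel feature maps at matching resolutions, and the online fine-tuning stage for a specific object would follow the offline schedule. Both are outside the scope of this toy sketch.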