CV, LG | Feb 14, 2024

Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation

arXiv:2402.08882v1 | 1 citation | h-index: 1
Originality: Synthesis-oriented
AI Analysis

This work addresses video object segmentation for computer vision applications, but its contribution is incremental: it fine-tunes and combines existing methods.

The paper tackles video object segmentation by proposing a neural network architecture that combines unsupervised optical flow estimation (UnFlow) with a fully convolutional SegNet to generate moving object proposals, with the optical flow model fine-tuned on the DAVIS dataset.

Dynamic scene understanding is one of the most prominent areas of interest in the computer vision community. Pixel-wise segmentation with neural networks is widely used to improve dynamic scene understanding, and recent work on pixel-wise segmentation has combined semantic and motion information to good effect. In this work, we propose a state-of-the-art neural network architecture that accurately and efficiently generates moving object proposals (MOP). We first train an unsupervised convolutional neural network (UnFlow) to estimate optical flow. We then feed the output of the optical flow network into a fully convolutional SegNet model. The main contributions of our work are (1) fine-tuning the pretrained optical flow model on the new DAVIS dataset, and (2) leveraging a fully convolutional neural network with an encoder-decoder architecture to segment objects. We implemented the code in TensorFlow and ran the training and evaluation processes on an AWS EC2 instance.
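
The hand-off between the two stages can be illustrated with a minimal sketch. The abstract does not say how the flow field is encoded before it reaches the segmentation network, so the encoding below (horizontal flow, vertical flow, and magnitude, each scaled by the frame's largest magnitude) is an assumption made for illustration, and `flow_to_input` is a hypothetical helper name rather than part of the authors' code. It uses plain Python lists instead of TensorFlow tensors to stay self-contained.

```python
import math


def flow_to_input(flow, eps=1e-8):
    """Turn an H x W grid of (u, v) flow vectors into 3-channel pixels.

    Assumed encoding (not taken from the paper): each output pixel is
    (u, v, magnitude), all divided by the largest magnitude in the frame
    so that every value falls in [-1, 1].
    """
    # Per-pixel motion magnitude.
    mags = [[math.hypot(u, v) for (u, v) in row] for row in flow]
    max_mag = max(max(row) for row in mags)
    # eps avoids division by zero on a completely static frame.
    scale = 1.0 / (max_mag + eps)
    return [
        [(u * scale, v * scale, m * scale) for (u, v), m in zip(row, mag_row)]
        for row, mag_row in zip(flow, mags)
    ]


# Toy 2 x 2 flow field: one pixel moves, the rest is static background.
flow = [
    [(0.0, 0.0), (3.0, 4.0)],
    [(0.0, 0.0), (0.0, 0.0)],
]
features = flow_to_input(flow)
```

In this toy frame the moving pixel ends up with a magnitude channel close to 1 and the static pixels stay at 0, which is the kind of motion cue an encoder-decoder segmentation network could learn to turn into a moving object mask.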
