CV · Apr 20, 2021

Flow-based Video Segmentation for Human Head and Shoulders

arXiv:2104.09752v1 · 11 citations · Has Code
AI Analysis

This work addresses segmentation challenges for videoconferencing and virtual reality applications, but it is incremental as it combines existing optical-flow techniques with neural networks.

The paper tackles real-time segmentation of the human head and shoulders for videoconferencing and virtual reality. It targets segmentation failures under motion blur, such as when a speaker shakes their head or waves a hand, and proposes a flow-based encoder-decoder network (FUNet) for robust segmentation.

Video segmentation for the human head and shoulders is essential in creating elegant media for videoconferencing and virtual reality applications. The main challenge is to perform high-quality background subtraction in real time and to handle segmentation under motion blur, e.g., when a participant shakes their head or waves their hands during a conference video. To overcome the motion blur problem in video segmentation, we propose a novel flow-based encoder-decoder network (FUNet) that combines the traditional Horn-Schunck optical-flow estimation technique with convolutional neural networks to perform robust real-time video segmentation. We also introduce a video and image segmentation dataset: ConferenceVideoSegmentationDataset. Code and pre-trained models are available on our GitHub repository: https://github.com/kuangzijian/Flow-Based-Video-Matting.
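For readers unfamiliar with the Horn-Schunck technique named in the abstract, the following is a minimal NumPy sketch of the classic iterative scheme. It is not taken from the authors' repository and says nothing about how FUNet consumes the flow; the function names `horn_schunck` and `_neighbor_average`, the smoothness weight `alpha`, and the iteration count `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np


def _neighbor_average(f):
    """Weighted 8-neighbour average used in the classic Horn-Schunck update."""
    p = np.pad(f, 1, mode="edge")  # replicate borders so the output keeps its shape
    direct = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
    diagonal = p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]
    return direct / 6.0 + diagonal / 12.0


def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Estimate dense optical flow (u, v) between two grayscale frames.

    alpha (smoothness weight) and n_iter are illustrative defaults,
    not values from the paper.
    """
    f1 = np.asarray(frame1, dtype=np.float64)
    f2 = np.asarray(frame2, dtype=np.float64)

    # Spatial derivatives of the averaged frame; np.gradient returns
    # the derivative along rows (y) first, then along columns (x).
    Iy, Ix = np.gradient((f1 + f2) / 2.0)
    # Temporal derivative as a simple frame difference.
    It = f2 - f1

    u = np.zeros_like(f1)  # horizontal flow
    v = np.zeros_like(f1)  # vertical flow
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2

    for _ in range(n_iter):
        u_avg = _neighbor_average(u)
        v_avg = _neighbor_average(v)
        # Residual of the brightness-constancy constraint at the averaged flow.
        t = (Ix * u_avg + Iy * v_avg + It) / denom
        u = u_avg - Ix * t
        v = v_avg - Iy * t
    return u, v
```

Calling `horn_schunck(prev_gray, next_gray)` on two consecutive grayscale frames returns two arrays with the same shape as the input, holding the per-pixel horizontal and vertical displacement. A larger `alpha` yields a smoother flow field at the cost of blurring motion boundaries.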

Code Implementations · 1 repo
