CV | Sep 10, 2021

Temporally Coherent Person Matting Trained on Fake-Motion Dataset

arXiv:2109.04843v1
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating temporally stable mattes for video editing applications, but the advance is incremental: it builds on existing segmentation and U-Net methods.

The paper tackles video matting of people without requiring trimaps. Its network achieves temporally stable alpha mattes by smoothing image-segmentation outputs with motion estimation and by placing convolutional-LSTM modules on the U-Net skip connections.

We propose a novel neural-network-based method to perform matting of videos depicting people that does not require additional user input such as trimaps. Our architecture achieves temporal stability of the resulting alpha mattes by using motion-estimation-based smoothing of image-segmentation algorithm outputs, combined with convolutional-LSTM modules on U-Net skip connections. We also propose a fake-motion algorithm that generates training clips for the video-matting network given photos with ground-truth alpha mattes and background videos. We apply random motion to photos and their mattes to simulate movement one would find in real videos and composite the result with the background clips. This lets us train a deep neural network operating on videos in the absence of a large annotated video dataset and provides ground-truth training-clip foreground optical flow for use in loss functions.
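The fake-motion idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the motion model, so this sketch assumes the simplest case, a random walk of whole-pixel translations. The names `shift` and `fake_motion_clip` and the parameter `max_step` are hypothetical. Each frame is composited as alpha * foreground + (1 - alpha) * background, and because the motion is synthetic, the foreground displacement between consecutive frames is known exactly.

```python
import numpy as np


def _span(n, d):
    """Source and destination slices for shifting an axis of length n by d pixels."""
    d = int(np.clip(d, -n, n))
    return slice(max(0, -d), n - max(0, d)), slice(max(0, d), n + min(0, d))


def shift(img, dy, dx):
    """Translate img by integer (dy, dx) pixels; uncovered areas are filled with zeros."""
    out = np.zeros_like(img)
    ys_src, ys_dst = _span(img.shape[0], dy)
    xs_src, xs_dst = _span(img.shape[1], dx)
    out[ys_dst, xs_dst] = img[ys_src, xs_src]
    return out


def fake_motion_clip(fg, alpha, bg_clip, rng, max_step=3):
    """Build a synthetic training clip from one photo, its matte, and a background clip.

    fg:      (H, W, 3) float foreground photo in [0, 1]
    alpha:   (H, W)    float ground-truth matte in [0, 1]
    bg_clip: (T, H, W, 3) float background frames in [0, 1]
    Returns composited frames, per-frame mattes, and the per-frame-pair
    foreground displacement (dy, dx), which is the ground-truth flow for
    this translation-only assumption.
    """
    num_frames = bg_clip.shape[0]
    # Assumed motion model: random integer step per frame, accumulated into offsets.
    steps = rng.integers(-max_step, max_step + 1, size=(num_frames, 2))
    steps[0] = 0  # the first frame shows the photo at its original position
    offsets = np.cumsum(steps, axis=0)

    frames = np.empty_like(bg_clip, dtype=np.float64)
    mattes = np.empty((num_frames,) + alpha.shape, dtype=np.float64)
    for t in range(num_frames):
        dy, dx = offsets[t]
        a = shift(alpha, dy, dx)
        f = shift(fg, dy, dx)
        # Standard alpha compositing of the moved foreground over the background.
        frames[t] = a[..., None] * f + (1.0 - a[..., None]) * bg_clip[t]
        mattes[t] = a
    flow = steps[1:]  # displacement of the foreground from frame t to frame t + 1
    return frames, mattes, flow
```

A real pipeline would likely use richer motion (for example sub-pixel, affine, or non-rigid warps), in which case the flow becomes a dense per-pixel field rather than one vector per frame pair.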

