CV · Nov 7, 2023

Video Instance Matting

arXiv:2311.04212v2 · 11 citations · h-index: 11 · Has Code
Originality: Highly original
AI Analysis

This work addresses the need for high-quality, instance-specific matting in videos, which matters for applications such as video editing and visual effects. It is incremental in that it builds on video instance segmentation, refining binary masks into alpha mattes.

The paper tackles the problem of estimating an alpha matte for each instance in every frame of a video sequence, where existing methods produce either a single matte per frame or binarized masks. It introduces MSG-VIM as a baseline model that outperforms prior methods by a large margin on the new VIM50 benchmark.
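To make the distinction concrete, here is a minimal NumPy sketch of the three output types mentioned above: per-instance alpha mattes, a single merged matte per frame, and binarized instance masks. The array layout `(T, N, H, W)`, the helper names `merge_instances` and `binarize`, the clipped-sum merge rule, and the 0.5 threshold are all illustrative assumptions, not taken from the paper or its code.

```python
import numpy as np

def merge_instances(alphas):
    """Collapse per-instance mattes (T, N, H, W) into one matte per frame (T, H, W).

    Assumption: instances are merged by a clipped sum; this only illustrates
    that instance identity is lost once the mattes are combined.
    """
    return np.clip(alphas.sum(axis=1), 0.0, 1.0)

def binarize(alphas, threshold=0.5):
    """Turn soft mattes into hard masks, as a segmentation-style output would."""
    return (alphas >= threshold).astype(alphas.dtype)

# Toy clip: T=2 frames, N=2 instances, 1x4 pixels each.
alphas = np.array([
    [[[1.0, 0.6, 0.0, 0.0]], [[0.0, 0.3, 0.9, 1.0]]],
    [[[1.0, 0.4, 0.0, 0.0]], [[0.0, 0.5, 1.0, 1.0]]],
])

merged = merge_instances(alphas)  # shape (2, 1, 4): no per-instance information
masks = binarize(alphas)          # shape (2, 2, 1, 4): fractional opacity is gone
```

In this toy clip the second pixel is shared between the two instances (0.6 and 0.3 in the first frame). The merged matte keeps only the combined opacity, and the binarized masks assign the pixel wholly to one instance. The per-instance mattes are the only representation that preserves both.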

Conventional video matting outputs one alpha matte for all instances appearing in a video frame so that individual instances are not distinguished. While video instance segmentation provides time-consistent instance masks, results are unsatisfactory for matting applications, especially due to applied binarization. To remedy this deficiency, we propose Video Instance Matting (VIM), that is, estimating alpha mattes of each instance at each frame of a video sequence. To tackle this challenging problem, we present MSG-VIM, a Mask Sequence Guided Video Instance Matting neural network, as a novel baseline model for VIM. MSG-VIM leverages a mixture of mask augmentations to make predictions robust to inaccurate and inconsistent mask guidance. It incorporates temporal mask and temporal feature guidance to improve the temporal consistency of alpha matte predictions. Furthermore, we build a new benchmark for VIM, called VIM50, which comprises 50 video clips with multiple human instances as foreground objects. To evaluate performances on the VIM task, we introduce a suitable metric called Video Instance-aware Matting Quality (VIMQ). Our proposed model MSG-VIM sets a strong baseline on the VIM50 benchmark and outperforms existing methods by a large margin. The project is open-sourced at https://github.com/SHI-Labs/VIM.
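The abstract does not say which mask augmentations MSG-VIM mixes. As a rough illustration of the general idea, the sketch below perturbs each frame of a binary mask sequence independently so that the guidance becomes both inaccurate and temporally inconsistent. The choice of operations (erode, dilate, keep, drop), the function names, and the parameters `max_radius` and `drop_prob` are hypothetical and are not the authors' implementation.

```python
import numpy as np

def dilate(mask, radius=1):
    """Binary dilation with a (2r+1)x(2r+1) square, built from padded shifts."""
    mask = mask.astype(bool)
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(mask, radius=1):
    """Binary erosion as the complement of dilating the background.

    Pixels outside the image count as foreground, so the image border
    itself does not erode the mask.
    """
    return ~dilate(~mask.astype(bool), radius)

def augment_mask_sequence(masks, rng, max_radius=2, drop_prob=0.1):
    """Perturb each frame of a (T, H, W) boolean mask sequence independently.

    Hypothetical mixture: drop the mask entirely, or erode / dilate / keep it.
    """
    out = np.empty_like(masks)
    for t, mask in enumerate(masks):
        if rng.random() < drop_prob:
            out[t] = False  # simulate a frame where the guidance is missing
            continue
        radius = int(rng.integers(1, max_radius + 1))
        op = rng.choice(["erode", "dilate", "keep"])
        if op == "erode":
            out[t] = erode(mask, radius)
        elif op == "dilate":
            out[t] = dilate(mask, radius)
        else:
            out[t] = mask
    return out

# Usage: a 3x3 square in a 5x5 frame, repeated over 4 frames.
square = np.zeros((5, 5), dtype=bool)
square[1:4, 1:4] = True
masks = np.stack([square] * 4)
noisy = augment_mask_sequence(masks, np.random.default_rng(0))
```

Because each frame is perturbed with its own random choice, the augmented sequence no longer agrees with itself from frame to frame. That is the kind of inconsistent guidance the abstract says the model should tolerate.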

Code Implementations: 1 repo
