CV | Apr 8, 2021

Progressive Temporal Feature Alignment Network for Video Inpainting

arXiv:2104.03507v1 | 67 citations | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses artifacts and misalignment in video inpainting, with applications such as video editing. The advance is incremental: it builds on existing flow-based and temporal methods.

The paper tackles video inpainting by proposing a Progressive Temporal Feature Alignment Network that corrects spatial misalignment in temporal feature propagation, achieving state-of-the-art performance on DAVIS and FVI datasets.

Video inpainting aims to fill spatio-temporal "corrupted" regions with plausible content. To achieve this goal, it is necessary to find correspondences from neighbouring frames to faithfully hallucinate the unknown content. Current methods achieve this goal through attention, flow-based warping, or 3D temporal convolution. However, flow-based warping can create artifacts when optical flow is not accurate, while temporal convolution may suffer from spatial misalignment. We propose 'Progressive Temporal Feature Alignment Network', which progressively enriches features extracted from the current frame with the feature warped from neighbouring frames using optical flow. Our approach corrects the spatial misalignment in the temporal feature propagation stage, greatly improving visual quality and temporal consistency of the inpainted videos. Using the proposed architecture, we achieve state-of-the-art performance on the DAVIS and FVI datasets compared to existing deep learning approaches. Code is available at https://github.com/MaureenZOU/TSAM.
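
The alignment idea in the abstract, warping neighbouring-frame features onto the current frame with optical flow before combining them, can be illustrated with a minimal NumPy sketch. This is not the authors' TSAM implementation: the function names `warp_features` and `fuse_aligned`, the flow convention (per-pixel displacement from the current frame to the neighbouring frame), the bilinear sampling with border clamping, and the simple masked-mean fusion rule are all assumptions chosen for illustration.

```python
import numpy as np


def warp_features(feat, flow):
    """Backward-warp a (C, H, W) feature map with a (2, H, W) flow field.

    Assumed convention: flow[0] is the horizontal and flow[1] the vertical
    displacement, in pixels, from each current-frame location to its
    match in the neighbouring frame.
    """
    _, h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Sampling positions in the neighbouring frame, clamped to the border.
    x = np.clip(xs + flow[0], 0, w - 1)
    y = np.clip(ys + flow[1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0
    wy = y - y0
    # Bilinear interpolation between the four surrounding feature vectors.
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bottom = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def fuse_aligned(cur_feat, neighbour_feats, flows, hole_mask):
    """Fill hole locations of cur_feat with the mean of aligned neighbours.

    Hypothetical fusion rule: the paper's learned fusion is replaced here
    by a plain average over the flow-warped neighbouring features.
    """
    warped = [warp_features(f, fl) for f, fl in zip(neighbour_feats, flows)]
    aligned_mean = np.mean(warped, axis=0)
    return np.where(hole_mask[None], aligned_mean, cur_feat)


# Toy example: one channel whose value equals the column index.
neighbour = np.tile(np.arange(4.0), (1, 4, 1))   # shape (1, 4, 4)
flow = np.zeros((2, 4, 4))
flow[0] = 1.0                                    # match lies one pixel to the right
warped = warp_features(neighbour, flow)

current = np.zeros((1, 4, 4))                    # current-frame features
hole = np.zeros((4, 4), dtype=bool)
hole[1:3, 1:3] = True                            # region to inpaint
fused = fuse_aligned(current, [neighbour], [flow], hole)
```

In this toy case the warped map reads each value from one column to the right, so features are spatially aligned with the current frame before they are written into the hole. Skipping the warp and copying `neighbour` directly would place content one pixel off, which is the kind of spatial misalignment the abstract attributes to plain temporal convolution.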

Code Implementations: 1 repo
