An Internal Learning Approach to Video Inpainting
This work addresses video inpainting for media editing with a training-free, video-specific approach, though the contribution is incremental because it builds on Deep Image Prior.
The paper tackles video inpainting with an algorithm that hallucinates missing appearance and motion without prior training, achieving visually plausible results with long-term consistency.
We propose a novel video inpainting algorithm that simultaneously hallucinates missing appearance and motion (optical flow) information, building upon the recent 'Deep Image Prior' (DIP) that exploits convolutional network architectures to enforce plausible texture in static images. In extending DIP to video we make two important contributions. First, we show that coherent video inpainting is possible without a priori training. We take a generative approach to inpainting based on internal (within-video) learning without reliance upon an external corpus of visual data to train a one-size-fits-all model for the large space of general videos. Second, we show that such a framework can jointly generate both appearance and flow, whilst exploiting these complementary modalities to ensure mutual consistency. We show that leveraging appearance statistics specific to each video achieves visually plausible results whilst handling the challenging problem of long-term consistency.
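To make the idea of mutual consistency between appearance and flow concrete, the toy sketch below shows two loss terms that an internal-learning objective of this kind might combine: a reconstruction term evaluated only on known pixels, and a term that compares one predicted frame with the next frame warped back by the predicted flow. The abstract does not specify the actual losses, so the function names, the integer-valued flow, the backward-warping convention, and the weight of 0.1 are all illustrative assumptions, not the authors' formulation.

```python
# Toy sketch (not the paper's implementation): two loss terms that a
# DIP-style internal-learning objective for video inpainting might combine.
# Frames are 2D lists of floats; masks use 1 for known pixels, 0 for the hole.
# All names and conventions here are hypothetical.


def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error over known pixels only (mask == 1)."""
    total, count = 0.0, 0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            if m:
                total += (p - t) ** 2
                count += 1
    return total / count if count else 0.0


def warp(frame, flow):
    """Backward-warp `frame` with an integer flow field.

    flow[y][x] = (dx, dy) means output pixel (x, y) is sampled from
    frame[y + dy][x + dx]. Out-of-bounds samples are returned as None.
    """
    height, width = len(frame), len(frame[0])
    warped = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = flow[y][x]
            sx, sy = x + dx, y + dy
            inside = 0 <= sx < width and 0 <= sy < height
            row.append(frame[sy][sx] if inside else None)
        warped.append(row)
    return warped


def consistency_loss(pred_t, pred_next, flow):
    """Mean squared difference between frame t and frame t+1 warped back to t."""
    warped = warp(pred_next, flow)
    total, count = 0.0, 0
    for row_t, row_w in zip(pred_t, warped):
        for p, w in zip(row_t, row_w):
            if w is not None:  # skip pixels whose source fell outside the frame
                total += (p - w) ** 2
                count += 1
    return total / count if count else 0.0


def total_loss(pred_t, pred_next, target_t, mask_t, flow, weight=0.1):
    """Weighted sum of both terms; `weight` is an arbitrary illustrative value."""
    return (masked_reconstruction_loss(pred_t, target_t, mask_t)
            + weight * consistency_loss(pred_t, pred_next, flow))
```

In a setting like the one the abstract describes, a network fitted to a single video would produce the predicted frames and flow, and a differentiable version of such terms would be minimized over that video alone. The reconstruction term ignores the hole entirely, so whatever fills it has to come from the network's structure and from agreement with neighbouring frames.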