PhaseNet for Video Frame Interpolation
This work addresses video frame interpolation for video processing and computer vision applications. It offers a robust solution for larger motions and difficult conditions, although the contribution is incremental: it builds on existing phase-based methods.
The authors address video frame interpolation in challenging scenarios such as lighting changes and motion blur. They propose PhaseNet, a neural network decoder that directly estimates the phase decomposition of the intermediate frame. On challenging datasets, it outperforms the hand-crafted heuristics of previous phase-based methods and compares favorably to recent deep learning methods.
Most approaches for video frame interpolation require accurate dense correspondences to synthesize an in-between frame. Therefore, they do not perform well in challenging scenarios with, e.g., lighting changes or motion blur. Recent deep learning approaches that rely on kernels to represent motion can only alleviate these problems to some extent. In those cases, methods that use a per-pixel phase-based motion representation have been shown to work well. However, they are only applicable to a limited range of motion. We propose a new approach, PhaseNet, that is designed to robustly handle challenging scenarios while also coping with larger motion. Our approach consists of a neural network decoder that directly estimates the phase decomposition of the intermediate frame. We show that this is superior to the hand-crafted heuristics previously used in phase-based methods and also compares favorably to recent deep-learning-based approaches for video frame interpolation on challenging datasets.
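To make the motion limit of phase-based representations concrete, the following is a minimal sketch of the classical phase-shift idea on a single 1D frequency band. It is not PhaseNet and not the authors' code: the function names (`complex_band`, `interpolate_band`), the frequency `omega`, and the shift values are all illustrative assumptions. A shift of a band with frequency `omega` changes its phase by `omega * shift`, so the in-between frame can be obtained by interpolating the phase. Because the phase difference is only known modulo 2π, this simple heuristic fails once `omega * shift` exceeds π.

```python
import numpy as np


def complex_band(x, omega, shift):
    """Hypothetical single-frequency complex band response of a shifted signal."""
    return np.exp(1j * omega * (x - shift))


def interpolate_band(r1, r2, alpha=0.5):
    """Interpolate two band responses by blending amplitude and wrapped phase."""
    # Phase difference, wrapped to (-pi, pi]; this wrapping is the source
    # of the ambiguity for large motion.
    dphi = np.angle(r2 * np.conj(r1))
    amp = (1.0 - alpha) * np.abs(r1) + alpha * np.abs(r2)
    return amp * np.exp(1j * (np.angle(r1) + alpha * dphi))


x = np.linspace(0.0, 10.0, 200)
omega = 2.0          # assumed band frequency (radians per unit length)
small_shift = 0.5    # omega * shift = 1.0 < pi: phase is unambiguous
large_shift = 2.0    # omega * shift = 4.0 > pi: phase wraps around

# Small motion: interpolated band matches the true half-way band.
mid_small = interpolate_band(complex_band(x, omega, 0.0),
                             complex_band(x, omega, small_shift))
err_small = np.max(np.abs(mid_small - complex_band(x, omega, small_shift / 2)))

# Large motion: the wrapped phase difference points the wrong way.
mid_large = interpolate_band(complex_band(x, omega, 0.0),
                             complex_band(x, omega, large_shift))
err_large = np.max(np.abs(mid_large - complex_band(x, omega, large_shift / 2)))

print(f"small-motion error: {err_small:.2e}")
print(f"large-motion error: {err_large:.2e}")
```

In this toy setting the small-motion error is at floating-point precision, while the large-motion error is of the same order as the signal amplitude. Real phase-based methods work on multi-scale, multi-orientation decompositions of images and use heuristics to resolve this ambiguity; the abstract's claim is that a learned decoder handles it better than those heuristics.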