Video Frame Interpolation via Generalized Deformable Convolution
This work improves video frame interpolation for applications such as video editing and compression, but the contribution is incremental because it builds on existing deep-learning approaches.
The paper addresses limitations of existing flow-based and kernel-based methods for video frame interpolation. It proposes a generalized deformable convolution mechanism that learns motion in a data-driven manner and freely selects sampling points in space-time. The resulting method performs favorably against state-of-the-art methods, particularly for complex motions.
Video frame interpolation aims at synthesizing intermediate frames from nearby source frames while maintaining spatial and temporal consistencies. The existing deep-learning-based video frame interpolation methods can be roughly divided into two categories: flow-based methods and kernel-based methods. The performance of flow-based methods is often jeopardized by the inaccuracy of flow map estimation due to oversimplified motion models, while that of kernel-based methods tends to be constrained by the rigidity of kernel shape. To address these performance-limiting issues, a novel mechanism named generalized deformable convolution is proposed, which can effectively learn motion information in a data-driven manner and freely select sampling points in space-time. We further develop a new video frame interpolation method based on this mechanism. Our extensive experiments demonstrate that the new method performs favorably against the state-of-the-art, especially when dealing with complex motions.
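To make the idea of freely selected space-time sampling points more concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each output pixel is a weighted sum of a few samples, where every sample has a spatial offset (dy, dx) and a temporal position t in [0, 1] between the two source frames, and that fractional positions are resolved by bilinear interpolation in space and linear interpolation in time. The function names (`bilinear_sample`, `spacetime_sample`, `interpolate_pixel`) are hypothetical, and the offsets and weights are set by hand here, whereas in a learned method they would be predicted by a network.

```python
import numpy as np


def bilinear_sample(frame, y, x):
    """Sample a 2-D frame at a fractional (y, x), clamping to the border."""
    h, w = frame.shape
    y = min(max(float(y), 0.0), h - 1.0)
    x = min(max(float(x), 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom


def spacetime_sample(frame0, frame1, y, x, t):
    """Sample at (y, x, t): bilinear in space, linear in time (assumption)."""
    t = min(max(float(t), 0.0), 1.0)
    return ((1 - t) * bilinear_sample(frame0, y, x)
            + t * bilinear_sample(frame1, y, x))


def interpolate_pixel(frame0, frame1, y, x, offsets, weights):
    """Weighted sum of samples at free space-time positions.

    offsets: (N, 3) array of (dy, dx, t) per sampling point.
    weights: (N,) array of per-point weights.
    Both would be network outputs in a learned method; here they are inputs.
    """
    total = 0.0
    for (dy, dx, t), w in zip(offsets, weights):
        total += w * spacetime_sample(frame0, frame1, y + dy, x + dx, t)
    return total


if __name__ == "__main__":
    f0 = np.zeros((4, 4))
    f1 = np.ones((4, 4))
    offsets = np.array([[0.0, 0.0, 0.5], [0.5, -0.5, 0.25]])
    weights = np.array([0.5, 0.5])
    print(interpolate_pixel(f0, f1, 1, 1, offsets, weights))
```

Because the sampling positions are unconstrained in both space and time, this formulation is not tied to a fixed kernel grid or to a single flow vector per pixel, which is the flexibility the abstract attributes to the proposed mechanism.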