CV, Mar 4, 2025

Unified Arbitrary-Time Video Frame Interpolation and Prediction

Xin Jin, Longhai Wu, Jie Chen, Ilhyun Cho, Cheul-Hee Hahm

arXiv:2503.02316v16.22 citations, h-index: 6, Has Code, ICASSP

Originality: Incremental advance

AI Analysis

This work addresses frame synthesis for both video interpolation and video prediction in computer vision. Its unified approach is an incremental advance that combines existing methods.

Video frame interpolation and prediction are usually handled as separate tasks. The authors propose a single model that supports both at arbitrary time steps, reporting competitive results for interpolation and results that outperform state-of-the-art methods for prediction.

Video frame interpolation and prediction aim to synthesize frames in between and subsequent to existing frames, respectively. Despite being closely related, these two tasks are traditionally studied with different model architectures, or with the same architecture but individually trained weights. Furthermore, while arbitrary-time interpolation has been extensively studied, the value of arbitrary-time prediction has been largely overlooked. In this work, we present uniVIP (unified arbitrary-time Video Interpolation and Prediction). Technically, we first extend an interpolation-only network to arbitrary-time interpolation and prediction, with a special input channel for task (interpolation or prediction) encoding. Then, we show how to train a unified model on common triplet frames. Our uniVIP provides competitive results for video interpolation, and outperforms existing state-of-the-art methods for video prediction. Code will be available at: https://github.com/srcn-ivl/uniVIP
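
The abstract does not spell out how the task channel or the triplet training are implemented, so the following is only a minimal sketch of one plausible reading, not the authors' code. It assumes the task is encoded as a constant plane (0 for interpolation, 1 for prediction) stacked next to a constant time plane, and that time is measured with the two input frames at 0 and 1. The names `build_input`, `samples_from_triplet`, `INTERPOLATION`, and `PREDICTION` are hypothetical.

```python
import numpy as np

# Hypothetical task codes; the encoding used by uniVIP may differ.
INTERPOLATION, PREDICTION = 0.0, 1.0


def build_input(frame_a, frame_b, t, task):
    """Stack two RGB frames with a constant time plane and a constant task plane.

    frame_a, frame_b: float arrays of shape (H, W, 3).
    Returns an array of shape (H, W, 8).
    """
    if frame_a.shape != frame_b.shape:
        raise ValueError("frames must share a shape")
    h, w, _ = frame_a.shape
    time_plane = np.full((h, w, 1), t, dtype=frame_a.dtype)
    task_plane = np.full((h, w, 1), task, dtype=frame_a.dtype)
    return np.concatenate([frame_a, frame_b, time_plane, task_plane], axis=-1)


def samples_from_triplet(f0, f1, f2):
    """Derive one interpolation and one prediction sample from a frame triplet."""
    # Interpolation: outer frames in, middle frame as target, t = 0.5.
    interp = (build_input(f0, f2, 0.5, INTERPOLATION), f1)
    # Prediction: first two frames in, third frame as target.
    # Assumed convention: inputs sit at t = 0 and t = 1, so the target is at t = 2.
    pred = (build_input(f0, f1, 2.0, PREDICTION), f2)
    return interp, pred
```

Under these assumptions, a single triplet yields a supervised sample for each task, so one set of weights can be trained on both by mixing the two sample types in a batch. Arbitrary-time behaviour would come from varying `t` at inference.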

