Task Agnostic Restoration of Natural Video Dynamics
This work addresses a key issue in video processing for applications such as restoration and translation: it offers a task-agnostic solution that does not need raw videos at test time. The contribution is incremental, however, since it builds on existing methods for temporal consistency.
The paper tackles the temporal inconsistencies that frame-wise processing introduces in video restoration and translation tasks. It proposes a framework that learns consistent motion dynamics from inconsistent videos to mitigate flicker while preserving perceptual quality, and it reports state-of-the-art results on the DAVIS and videvo.net benchmark datasets.
In many video restoration/translation tasks, image processing operations are naïvely extended to the video domain by processing each frame independently, disregarding the temporal connection between the video frames. This disregard often leads to severe temporal inconsistencies. State-of-the-art (SOTA) techniques that address these inconsistencies rely on the availability of unprocessed videos: they implicitly extract consistent video dynamics from the raw video and use them to restore the temporal consistency of the frame-wise processed video, which often jeopardizes the translation effect. We propose a general framework for this task that learns to infer and utilize consistent motion dynamics from inconsistent videos. It mitigates the temporal flicker while preserving the perceptual quality for both temporally neighboring and relatively distant frames, without requiring the raw videos at test time. The proposed framework produces SOTA results on two benchmark datasets, DAVIS and videvo.net, processed by numerous image processing applications. The code and the trained models are available at https://github.com/MKashifAli/TARONVD.
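To make the frame-wise processing problem concrete, the following is a minimal toy sketch, not the proposed framework and not the evaluation metric used in the paper. It assumes a static synthetic "video" and a hypothetical per-frame operator (`process_frame_wise`) whose output gain varies randomly from frame to frame. All function names and the `flicker_score` measure are illustrative assumptions; on a static scene, the mean absolute difference between consecutive frames is a reasonable proxy for flicker.

```python
import random


def make_static_video(num_frames=8, num_pixels=16, value=0.5):
    """Toy 'video': every frame is the same flat list of pixel intensities."""
    return [[value] * num_pixels for _ in range(num_frames)]


def process_frame_wise(video, seed=0):
    """Apply a hypothetical image operator to each frame independently.

    The per-frame random gain stands in for any image-based method whose
    output varies slightly from one frame to the next (an assumption made
    purely for illustration).
    """
    rng = random.Random(seed)
    processed = []
    for frame in video:
        gain = rng.uniform(0.8, 1.2)  # chosen without looking at neighboring frames
        processed.append([min(1.0, p * gain) for p in frame])
    return processed


def flicker_score(video):
    """Mean absolute intensity change between consecutive frames."""
    total, count = 0.0, 0
    for prev, curr in zip(video, video[1:]):
        for a, b in zip(prev, curr):
            total += abs(a - b)
            count += 1
    return total / count


raw = make_static_video()
processed = process_frame_wise(raw)
print("raw flicker:", flicker_score(raw))
print("frame-wise flicker:", flicker_score(processed))
```

The raw static video has no frame-to-frame change, while the independently processed one does, even though the scene itself never moves. For real videos with motion, a plain frame difference would also count legitimate scene changes, so a practical measure would first need to compensate for motion, for example by aligning consecutive frames with optical flow.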