CV · Mar 13, 2020

Is There Tradeoff between Spatial and Temporal in Video Super-Resolution?

arXiv:2003.06141v1 · 2.32 citations

Originality: Synthesis-oriented

AI Analysis

This paper tackles temporal inconsistency in video super-resolution, which matters for applications that require smooth visual output. The contribution appears incremental, since it builds on existing deep learning methods.

The paper investigates whether video super-resolution methods optimized for frame-wise spatial quality (e.g., PSNR) also achieve good temporal consistency between frames, addressing the issue of flickering, and explores joint optimization of both metrics.

Recent advances of deep learning lead to great success of image and video super-resolution (SR) methods that are based on convolutional neural networks (CNN). For video SR, advanced algorithms have been proposed to exploit the temporal correlation between low-resolution (LR) video frames, and/or to super-resolve a frame with multiple LR frames. These methods pursue higher quality of super-resolved frames, where the quality is usually measured frame by frame in e.g. PSNR. However, frame-wise quality may not reveal the consistency between frames. If an algorithm is applied to each frame independently (which is the case of most previous methods), the algorithm may cause temporal inconsistency, which can be observed as flickering. It is a natural requirement to improve both frame-wise fidelity and between-frame consistency, which are termed spatial quality and temporal quality, respectively. Then we may ask, is a method optimized for spatial quality also optimized for temporal quality? Can we optimize the two quality metrics jointly?
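The distinction between the two quality notions can be illustrated with a minimal sketch. Spatial quality below is the mean frame-wise PSNR, as the abstract describes. The temporal measure is only an assumed proxy (PSNR computed on differences between consecutive frames); the summary above does not state which temporal metric the paper uses, and the function names `spatial_quality` and `temporal_quality` are hypothetical.

```python
import numpy as np


def psnr(ref, test, peak=255.0):
    """PSNR in dB between two arrays of the same shape."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - test) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


def spatial_quality(ref_video, sr_video, peak=255.0):
    """Mean frame-wise PSNR over a video shaped (frames, height, width)."""
    scores = [psnr(r, s, peak) for r, s in zip(ref_video, sr_video)]
    return float(np.mean(scores))


def temporal_quality(ref_video, sr_video, peak=255.0):
    """Assumed proxy, not the paper's metric: PSNR between the
    consecutive-frame differences of the reference and the SR output."""
    ref = np.asarray(ref_video, dtype=np.float64)
    sr = np.asarray(sr_video, dtype=np.float64)
    ref_diff = ref[1:] - ref[:-1]
    sr_diff = sr[1:] - sr[:-1]
    scores = [psnr(r, s, peak) for r, s in zip(ref_diff, sr_diff)]
    return float(np.mean(scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.uniform(50.0, 200.0, size=(4, 16, 16))

    # Output A: the same error in every frame (temporally steady).
    steady = ref + 2.0
    # Output B: an error of equal magnitude whose sign alternates per frame.
    signs = np.array([1.0, -1.0, 1.0, -1.0]).reshape(4, 1, 1)
    flicker = ref + 2.0 * signs

    print("spatial  A/B:", spatial_quality(ref, steady), spatial_quality(ref, flicker))
    print("temporal A/B:", temporal_quality(ref, steady), temporal_quality(ref, flicker))
```

In this toy setup both outputs have the same per-frame error magnitude, so their frame-wise PSNR matches, yet the alternating output scores worse on the frame-difference proxy. That is the kind of gap frame-by-frame evaluation cannot reveal.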
