CV · Aug 25, 2021

Memory-Augmented Non-Local Attention for Video Super-Resolution

arXiv:2108.11048v1 · 44 citations
Originality: Highly original
AI Analysis

This paper addresses the problem of generating high-fidelity videos from low-resolution inputs, particularly for videos with large motions, and offers an incremental improvement over existing methods.

The paper tackles video super-resolution with two components. A cross-frame non-local attention mechanism avoids frame alignment, which makes the method robust to large motions, and a memory-augmented attention module captures details beyond the neighbor frames. Together they achieve superior performance on large-motion videos compared with state-of-the-art methods.

In this paper, we propose a novel video super-resolution method that aims at generating high-fidelity high-resolution (HR) videos from low-resolution (LR) ones. Previous methods predominantly leverage temporal neighbor frames to assist the super-resolution of the current frame. Those methods achieve limited performance as they suffer from the challenge of spatial frame alignment and the lack of useful information from similar LR neighbor frames. In contrast, we devise a cross-frame non-local attention mechanism that allows video super-resolution without frame alignment, making it more robust to large motions in the video. In addition, to acquire information beyond neighbor frames, we design a novel memory-augmented attention module to memorize general video details during the super-resolution training. Experimental results indicate that our method can achieve superior performance on large-motion videos compared with the state-of-the-art methods, without aligning frames. Our source code will be released.
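The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify projections, feature shapes, or how the branches are combined, so the function names (`attend`, `cross_frame_attention`, `memory_attention`, `fuse`), the plain scaled dot-product attention, and the additive fusion are all assumptions made for illustration. Each frame is a list of per-position feature vectors, and the memory bank is a list of vectors standing in for parameters that would be learned during training.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    # Scaled dot-product attention: each query returns a weighted mean of values.
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(values[0]))])
    return out

def cross_frame_attention(current, neighbors):
    # Pool every position of every neighbor frame into one key/value set,
    # so each query position can match content wherever it moved to.
    # No warping or alignment step is involved (assumed simplification).
    pooled = [feat for frame in neighbors for feat in frame]
    return attend(current, pooled, pooled)

def memory_attention(current, memory):
    # Query a memory bank that is independent of the input video.
    # Here the memory slots act as both keys and values (assumption).
    return attend(current, memory, memory)

def fuse(current, neighbors, memory):
    # Hypothetical fusion: residual sum of the current features and both branches.
    cross = cross_frame_attention(current, neighbors)
    mem = memory_attention(current, memory)
    return [[c + x + m for c, x, m in zip(cf, xf, mf)]
            for cf, xf, mf in zip(current, cross, mem)]

if __name__ == "__main__":
    current = [[1.0, 0.0], [0.0, 1.0]]        # 2 positions, 2 channels
    neighbors = [[[0.0, 1.0], [1.0, 0.0]]]    # same content, positions swapped
    memory = [[0.5, 0.5], [1.0, 1.0], [0.0, 0.0]]
    print(fuse(current, neighbors, memory))
```

Because the neighbor positions are pooled before attention, reordering them (a crude stand-in for large motion) leaves the cross-frame output unchanged up to floating-point rounding, which is the property that lets this kind of attention skip alignment.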

Code Implementations: 1 repo
