CV, Sep 13, 2023

Aggregating Nearest Sharp Features via Hybrid Transformers for Video Deblurring

Wei Shang, Dongwei Ren, Yi Yang, Wangmeng Zuo

arXiv:2309.07054v25.010 citations, h-index: 103, Has Code

Originality: Incremental advance

AI Analysis

This paper addresses video deblurring for real-world scenarios in which sharp frames are present among the blurry ones, offering incremental improvements over existing methods.

The paper tackles video deblurring by leveraging sharp frames interspersed in blurry videos, using hybrid Transformers to aggregate features from neighboring and sharp frames, achieving state-of-the-art performance on benchmark datasets with improved quantitative metrics and visual quality.

Video deblurring methods, which aim to recover consecutive sharp frames from a given blurry video, usually assume that the input video consists of consecutively blurry frames. However, in real-world scenarios captured by modern imaging devices, sharp frames are often interspersed within the video, providing temporally nearest sharp features that can aid in the restoration of blurry frames. In this work, we propose a video deblurring method that leverages both neighboring frames and existing sharp frames using hybrid Transformers for feature aggregation. Specifically, we first train a blur-aware detector to distinguish between sharp and blurry frames. Then, a window-based local Transformer is employed to exploit features from neighboring frames, where cross attention aggregates these features without explicit spatial alignment. To aggregate nearest sharp features from detected sharp frames, we utilize a global Transformer with multi-scale matching capability. Moreover, our method can easily be extended to event-driven video deblurring by incorporating an event fusion module into the global Transformer. Extensive experiments on benchmark datasets demonstrate that our proposed method outperforms state-of-the-art video deblurring methods as well as event-driven video deblurring methods in terms of quantitative metrics and visual quality. The source code and trained models are available at https://github.com/shangwei5/STGTN.
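The idea of "temporally nearest sharp features" can be illustrated with a minimal sketch. It assumes the blur-aware detector has already produced one boolean label per frame, and it only looks up, for each blurry frame, the closest sharp frame before and after it. The function name `nearest_sharp_frames` and the dictionary output are hypothetical and chosen for illustration; this is not the authors' implementation, which aggregates features with a global Transformer rather than returning indices.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple


def nearest_sharp_frames(
    is_sharp: List[bool],
) -> Dict[int, Tuple[Optional[int], Optional[int]]]:
    """Map each blurry frame index to (previous_sharp, next_sharp).

    `is_sharp` is an assumed per-frame detector output (True = sharp).
    A side is None when no sharp frame exists in that direction.
    """
    # Indices of detected sharp frames, already in ascending order.
    sharp_idx = [i for i, sharp in enumerate(is_sharp) if sharp]

    result: Dict[int, Tuple[Optional[int], Optional[int]]] = {}
    for i, sharp in enumerate(is_sharp):
        if sharp:
            continue  # sharp frames need no restoration reference
        # Position where i would be inserted among the sharp indices.
        pos = bisect_left(sharp_idx, i)
        prev_sharp = sharp_idx[pos - 1] if pos > 0 else None
        next_sharp = sharp_idx[pos] if pos < len(sharp_idx) else None
        result[i] = (prev_sharp, next_sharp)
    return result


# Hypothetical detector labels for a six-frame clip.
labels = [False, True, False, False, True, False]
references = nearest_sharp_frames(labels)
```

For this clip, frames 2 and 3 would draw on sharp frames 1 and 4, while frames 0 and 5 each have a sharp reference on one side only.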
