CV · Mar 28, 2018

Adversarial Spatio-Temporal Learning for Video Deblurring

arXiv:1804.00533v2 · 297 citations
Originality: Incremental advance
AI Analysis

This paper addresses blur in videos captured by hand-held cameras, offering an incremental improvement over existing methods.

The paper tackles video deblurring by proposing a network that models spatio-temporal characteristics using modified 3D convolutions and integrates it into a GAN framework with content and adversarial losses, achieving state-of-the-art performance on two standard benchmarks.
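As a rough illustration of what a 3D convolution over neighboring frames does, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `conv3d_valid_time`, the five-frame 8x8 input, the averaging kernel, and the choice to pad only the spatial dimensions (so the temporal dimension shrinks with each layer) are all assumptions made for illustration.

```python
import numpy as np

def conv3d_valid_time(frames, kernel):
    """Naive 3D convolution (cross-correlation) over a (T, H, W) frame stack.

    Assumption for this sketch: the spatial dims are zero-padded so H and W
    are preserved, while the temporal dim is left unpadded, so T shrinks by
    kt - 1. Kernel sizes are assumed to be odd.
    """
    t, h, w = frames.shape
    kt, kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Zero-pad height and width only; leave the temporal axis untouched.
    padded = np.pad(frames, ((0, 0), (ph, ph), (pw, pw)))
    out = np.zeros((t - kt + 1, h, w), dtype=np.float64)
    for i in range(t - kt + 1):
        for y in range(h):
            for x in range(w):
                # Each output pixel mixes a kt x kh x kw block that spans
                # both neighboring frames and neighboring pixels.
                out[i, y, x] = np.sum(padded[i:i + kt, y:y + kh, x:x + kw] * kernel)
    return out

# Hypothetical input: five neighboring grayscale frames of size 8x8.
frames = np.random.default_rng(0).random((5, 8, 8))
kernel = np.full((3, 3, 3), 1.0 / 27.0)  # simple averaging kernel

once = conv3d_valid_time(frames, kernel)
twice = conv3d_valid_time(once, kernel)
print(once.shape, twice.shape)  # (3, 8, 8) (1, 8, 8)
```

Under these assumptions, two stacked layers fuse five input frames into a single output frame, which is one way a network could aggregate temporal context around a central frame.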

Camera shake or target movement often leads to undesired blur effects in videos captured by a hand-held camera. Although significant effort has been devoted to video deblurring research, two major challenges remain: 1) how to model the spatio-temporal characteristics across both the spatial domain (i.e., image plane) and temporal domain (i.e., neighboring frames), and 2) how to restore sharp image details w.r.t. the conventionally adopted metric of pixel-wise errors. In this paper, to address the first challenge, we propose a DeBLuRring Network (DBLRNet) for spatio-temporal learning by applying a modified 3D convolution to both spatial and temporal domains. Our DBLRNet is able to jointly capture spatial and temporal information encoded in neighboring frames, which directly contributes to improved video deblurring performance. To tackle the second challenge, we leverage the developed DBLRNet as a generator in the GAN (generative adversarial network) architecture, and employ a content loss in addition to an adversarial loss for efficient adversarial training. The developed network, which we name DeBLuRring Generative Adversarial Network (DBLRGAN), is tested on two standard benchmarks and achieves state-of-the-art performance.
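The idea of training a generator with a content loss plus an adversarial loss can be sketched as follows. This is a hedged illustration, not the paper's exact formulation: using mean squared error as the content term, the non-saturating form -log D(G(x)) as the adversarial term, and the weight `adv_weight=1e-3` are all assumptions, as are the function names.

```python
import numpy as np

def content_loss(restored, sharp):
    # Assumed content term: pixel-wise mean squared error.
    return float(np.mean((restored - sharp) ** 2))

def adversarial_loss(d_prob_on_restored, eps=1e-12):
    # Assumed adversarial term: -log D(G(x)), averaged over the batch.
    # d_prob_on_restored holds the discriminator's "real" probabilities
    # for the generator's outputs.
    return float(-np.mean(np.log(d_prob_on_restored + eps)))

def generator_loss(restored, sharp, d_prob_on_restored, adv_weight=1e-3):
    # Weighted sum of both terms; adv_weight is a hypothetical value.
    return content_loss(restored, sharp) + adv_weight * adversarial_loss(d_prob_on_restored)

# Toy example: a restored frame that is off by 0.1 everywhere, and a
# discriminator that is undecided (probability 0.5) about it.
sharp = np.zeros((4, 4))
restored = np.full((4, 4), 0.1)
d_prob = np.array([0.5])

total = generator_loss(restored, sharp, d_prob)
print(total)
```

In this sketch the content term keeps the output close to the ground-truth frame, while the small adversarial term nudges the generator toward outputs the discriminator accepts as sharp.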

Code Implementations: 1 repo
