CVDec 21, 2018

3DSRnet: Video Super-resolution using 3D Convolutional Neural Networks

arXiv:1812.09079v250 citations
AI Analysis

This work improves video super-resolution for practical applications by handling scene changes, though it is incremental as it builds on existing 3D-CNN methods.

The authors tackled video super-resolution by proposing 3DSRnet, a 3D-CNN that avoids motion alignment preprocessing and addresses scene change issues, achieving average gains of 0.45 dB and 0.36 dB in PSNR for scales 3 and 4 on the Vidset4 benchmark.

In video super-resolution, the spatio-temporal coherence between, and among the frames must be exploited appropriately for accurate prediction of the high resolution frames. Although 2D convolutional neural networks (CNNs) are powerful in modelling images, 3D-CNNs are more suitable for spatio-temporal feature extraction as they can preserve temporal information. To this end, we propose an effective 3D-CNN for video super-resolution, called the 3DSRnet that does not require motion alignment as preprocessing. Our 3DSRnet maintains the temporal depth of spatio-temporal feature maps to maximally capture the temporally nonlinear characteristics between low and high resolution frames, and adopts residual learning in conjunction with the sub-pixel outputs. It outperforms the most state-of-the-art method with average 0.45 and 0.36 dB higher in PSNR for scales 3 and 4, respectively, in the Vidset4 benchmark. Our 3DSRnet first deals with the performance drop due to scene change, which is important in practice but has not been previously considered.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes