IV | CV | Oct 30, 2019

C3DVQA: Full-Reference Video Quality Assessment with 3D Convolutional Neural Network

arXiv:1910.13646v2 | 61 citations
Originality: Incremental advance
AI Analysis

This work addresses video quality assessment for applications such as streaming and broadcasting. It is incremental, however, because it builds on existing CNN methods with 3D kernels.

The paper tackles video quality assessment by proposing C3DVQA, a method that uses 3D convolutional neural networks to capture temporal masking effects. It reports state-of-the-art performance on the LIVE and CSIQ datasets.

Traditional video quality assessment (VQA) methods evaluate localized picture quality and predict the video score by temporally aggregating frame scores. However, video quality exhibits different characteristics from static image quality because of temporal masking effects. In this paper, we present a novel architecture, C3DVQA, that uses a Convolutional Neural Network with 3D kernels (C3D) for the full-reference VQA task. C3DVQA combines feature learning and score pooling into one spatiotemporal feature learning process. We use 2D convolutional layers to extract spatial features and 3D convolutional layers to learn spatiotemporal features. We empirically found that 3D convolutional layers are capable of capturing the temporal masking effects of videos. We evaluated the proposed method on the LIVE and CSIQ datasets. The experimental results demonstrate that the proposed method achieves state-of-the-art performance.
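To make the "traditional" pipeline described in the abstract concrete, here is a minimal sketch of a full-reference score obtained by rating each frame independently and then averaging over time. It is not the paper's method. The frame metric (PSNR), the mean pooling, the function names `frame_psnr` and `pooled_video_score`, and the toy data are all assumptions chosen for illustration.

```python
import math


def frame_psnr(ref, dist, peak=255.0):
    """PSNR between two equally sized grayscale frames (lists of pixel rows).

    PSNR is an illustrative per-frame metric, not one prescribed by the paper.
    """
    sq_err = 0.0
    count = 0
    for ref_row, dist_row in zip(ref, dist):
        for r, d in zip(ref_row, dist_row):
            sq_err += (r - d) ** 2
            count += 1
    mse = sq_err / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def pooled_video_score(ref_video, dist_video):
    """Score each frame pair independently, then average over time."""
    scores = [frame_psnr(r, d) for r, d in zip(ref_video, dist_video)]
    return sum(scores) / len(scores)


# Hypothetical toy data: three 2x2 frames, with the distortion growing over time.
ref_video = [[[100, 100], [100, 100]] for _ in range(3)]
dist_video = [[[100 + k, 100 + k], [100 + k, 100 + k]] for k in (5, 10, 20)]

score = pooled_video_score(ref_video, dist_video)
# Reversing the frame order leaves the set of per-frame scores unchanged.
reversed_score = pooled_video_score(ref_video[::-1], dist_video[::-1])
```

Because every frame is scored in isolation and the scores are simply averaged, the result does not depend on the temporal order of the frames. This is one way to see why such pooling cannot account for temporal effects, which the abstract says C3DVQA handles by learning spatiotemporal features with 3D convolutions.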

