CV · Nov 18, 2021

Blind VQA on 360° Video via Progressively Learning from Pixels, Frames and Video

Li Yang, Mai Xu, Shengxi Li, Yichen Guo, Zulin Wang

arXiv:2111.09503v14.718 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses quality assessment for immersive multimedia systems, but it is incremental: it builds on existing BVQA approaches by adding a progressive learning paradigm.

The paper tackles blind visual quality assessment (BVQA) for 360° video by proposing ProVQA, a method that progressively learns from pixels, frames, and video to mimic human perception, and it significantly advances state-of-the-art performance on two datasets.

Blind visual quality assessment (BVQA) on 360° video plays a key role in optimizing immersive multimedia systems. When assessing the quality of 360° video, humans tend to perceive quality degradation progressively: from the viewport-based spatial distortion of each spherical frame, to motion artifacts across adjacent frames, ending with a video-level quality score. Existing BVQA approaches for 360° video, however, neglect this progressive quality assessment paradigm. In this paper, we take into account the progressive paradigm of human perception of spherical video quality and propose a novel BVQA approach, ProVQA, for 360° video that progressively learns from pixels, frames and video. Corresponding to these three levels, ProVQA comprises three sub-nets: the spherical perception aware quality prediction (SPAQ), motion perception aware quality prediction (MPAQ) and multi-frame temporal non-local (MFTN) sub-nets. The SPAQ sub-net first models spatial quality degradation based on the spherical perception mechanism of humans. Then, by exploiting motion cues across adjacent frames, the MPAQ sub-net incorporates motion contextual information for quality assessment on 360° video. Finally, the MFTN sub-net aggregates multi-frame quality degradation to yield the final quality score by exploring long-term quality correlation across multiple frames. Experiments validate that our approach significantly advances state-of-the-art BVQA performance on 360° video over two datasets. The code is publicly available at https://github.com/yanglixiaoshen/ProVQA.
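The pixels-to-frames-to-video flow described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation and does not reproduce the SPAQ, MPAQ or MFTN sub-nets: the cosine-latitude weighting, the adjacent-frame feature difference, the softmax self-attention, and all function names and the linear `head` are assumptions chosen only to show how three progressive stages could be chained.

```python
import numpy as np

def latitude_weights(height):
    # Assumed stand-in for spherical perception: cos(latitude) at the row
    # centers of an equirectangular frame, so polar rows count less.
    lat = (np.arange(height) + 0.5) / height * np.pi - np.pi / 2
    return np.cos(lat)

def spatial_stage(frames):
    # Pixels -> frames. frames: (T, H, W) grayscale array.
    # Returns (T, 2): latitude-weighted mean and standard deviation per frame.
    w = np.broadcast_to(latitude_weights(frames.shape[1])[None, :, None], frames.shape)
    total = w.sum(axis=(1, 2))
    mean = (frames * w).sum(axis=(1, 2)) / total
    var = (((frames - mean[:, None, None]) ** 2) * w).sum(axis=(1, 2)) / total
    return np.stack([mean, np.sqrt(var)], axis=1)

def motion_stage(feats):
    # Frames -> motion context. Appends the difference to the previous
    # frame's features (zeros for the first frame). Returns (T, 2 * D).
    diff = np.diff(feats, axis=0, prepend=feats[:1])
    return np.concatenate([feats, diff], axis=1)

def temporal_nonlocal_stage(feats):
    # Frames -> video. Every frame attends to every other frame
    # (softmax self-attention), then the result is averaged over time.
    logits = feats @ feats.T / np.sqrt(feats.shape[1])
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return (attn @ feats).mean(axis=0)

def predict_quality(frames, head):
    # Chains the three stages; `head` is a hypothetical linear read-out.
    video_feat = temporal_nonlocal_stage(motion_stage(spatial_stage(frames)))
    return float(video_feat @ head)
```

In a learned model each stage would be a trained network rather than a fixed statistic. The sketch only makes the data flow concrete: per-frame descriptors of shape (T, 2), motion-augmented descriptors of shape (T, 4), then a single video-level vector mapped to one score.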
