Quality Assessment of Free-viewpoint Videos by Quantifying the Elastic Changes of Multi-Scale Motion Trajectories
This addresses quality assessment for immersive applications like Free-viewpoint TV, though it is incremental as it improves upon existing metrics for a specific domain.
The paper tackles the problem of quality assessment for free-viewpoint videos, where existing metrics fail to handle non-uniform spatial-temporal distortions from viewpoint synthesis, by proposing EM-VQM which quantifies motion trajectory deformations and structural dissimilarity, achieving gains of 12.86% and 16.75% in correlation coefficients over state-of-the-art metrics on two datasets.
Virtual viewpoints synthesis is an essential process for many immersive applications including Free-viewpoint TV (FTV). A widely used technique for viewpoints synthesis is Depth-Image-Based-Rendering (DIBR) technique. However, such techniques may introduce challenging non-uniform spatial-temporal structure-related distortions. Most of the existing state-of-the-art quality metrics fail to handle these distortions, especially the temporal structure inconsistencies observed during the switch of different viewpoints. To tackle this problem, an elastic metric and multi-scale trajectory based video quality metric (EM-VQM) is proposed in this paper. Dense motion trajectory is first used as a proxy for selecting temporal sensitive regions, where local geometric distortions might significantly diminish the perceived quality. Afterwards, the amount of temporal structure inconsistencies and unsmooth viewpoints transitions are quantified by calculating 1) the amount of motion trajectory deformations with elastic metric and, 2) the spatial-temporal structural dissimilarity. According to the comprehensive experimental results on two FTV video datasets, the proposed metric outperforms the state-of-the-art metrics designed for free-viewpoint videos significantly and achieves a gain of 12.86% and 16.75% in terms of median Pearson linear correlation coefficient values on the two datasets compared to the best one, respectively.