CV | AI | Oct 13, 2021

Considering user agreement in learning to predict the aesthetic quality

arXiv:2110.06956v1
Originality: Incremental advance
AI Analysis

This work addresses the problem of subjective aesthetic ranking for image analysis applications, offering an incremental improvement by incorporating uncertainty estimation into existing methods.

The paper tackles robust ranking of image aesthetic quality by predicting both the mean opinion score and its standard deviation, so that user disagreement is taken into account. It also introduces a confidence interval ranking loss that focuses on uncertain image pairs, and reports state-of-the-art performance on the AVA and TMGA datasets.

How to robustly rank the aesthetic quality of given images has been a long-standing ill-posed topic. This challenge stems mainly from the diverse subjective opinions of different observers about the varied types of content. There is a growing interest in estimating the user agreement by considering the standard deviation of the scores, instead of only predicting the mean aesthetic opinion score. Nevertheless, when comparing a pair of contents, few studies consider how confident we are regarding the difference in the aesthetic scores. In this paper, we thus propose (1) a re-adapted multi-task attention network to predict both the mean opinion score and the standard deviation in an end-to-end manner; (2) a brand-new confidence interval ranking loss that encourages the model to focus on image pairs whose difference in aesthetic scores is less certain. With such a loss, the model is encouraged to learn the uncertainty of the content that is relevant to the diversity of observers' opinions, i.e., user disagreement. Extensive experiments have demonstrated that the proposed multi-task aesthetic model achieves state-of-the-art performance on two different types of aesthetic datasets, i.e., AVA and TMGA.
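The abstract does not give the exact form of the confidence interval ranking loss, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' formulation. It assumes a normal-approximation confidence interval for the difference of two mean opinion scores, treats a pair as "uncertain" when that interval contains zero, and applies a plain hinge ranking loss to those pairs only. All function names, the `z` and `margin` parameters, and the per-image vote counts `n_votes` are hypothetical.

```python
import math


def diff_confidence_interval(mean_a, std_a, n_a, mean_b, std_b, n_b, z=1.96):
    """Normal-approximation CI for the difference of two mean opinion scores.

    Assumption: votes for each image are independent, so the standard error
    of the difference is sqrt(std_a^2 / n_a + std_b^2 / n_b).
    """
    diff = mean_a - mean_b
    half_width = z * math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return diff - half_width, diff + half_width


def ci_ranking_loss(pred_means, gt_means, gt_stds, n_votes, margin=0.0, z=1.96):
    """Hinge ranking loss averaged over uncertain pairs only (illustrative).

    A pair is treated as uncertain when the CI of its ground-truth score
    difference contains zero. This selection rule is an assumption made
    for illustration, not the rule from the paper.
    """
    total, count = 0.0, 0
    n = len(pred_means)
    for i in range(n):
        for j in range(i + 1, n):
            low, high = diff_confidence_interval(
                gt_means[i], gt_stds[i], n_votes[i],
                gt_means[j], gt_stds[j], n_votes[j], z,
            )
            if not (low <= 0.0 <= high):
                continue  # confident pair: skipped in this sketch
            # +1 if image i should rank at least as high as image j, else -1
            sign = 1.0 if gt_means[i] >= gt_means[j] else -1.0
            total += max(0.0, margin - sign * (pred_means[i] - pred_means[j]))
            count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    gt_means = [5.0, 5.1, 8.0]
    gt_stds = [1.5, 1.5, 1.5]
    n_votes = [200, 200, 200]
    # Only the first two images form an uncertain pair here.
    print(ci_ranking_loss([5.2, 5.0, 7.0], gt_means, gt_stds, n_votes))
```

In a real training setup the same selection would typically be expressed with differentiable tensor operations and combined with regression losses on the mean and standard deviation; how the paper weights these terms is not stated in the abstract.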

Code Implementations: 1 repo
