CV · Jun 17, 2022

Rarity Score : A New Metric to Evaluate the Uncommonness of Synthesized Images

Jiyeon Han, Hwanil Choi, Yunjey Choi, Junho Kim, Jung-Woo Ha, Jaesik Choi

arXiv:2206.08549v2 · 19.139 citations · h-index: 85

Originality: Incremental advance

AI Analysis

This work addresses a gap in image synthesis evaluation for researchers and practitioners by providing a tool to assess diversity at the image level, though it is incremental as it builds on existing feature space methods.

The authors tackled the problem of evaluating the rarity of individual images generated by AI models, proposing a new metric called 'rarity score' that quantifies uncommonness based on nearest-neighbor distances in feature space, and demonstrated its effectiveness in comparing generative models and datasets like CelebA-HQ and FFHQ.

Evaluation metrics in image synthesis play a key role in measuring the performance of generative models. However, most metrics focus mainly on image fidelity. Existing diversity metrics are derived by comparing distributions, and thus cannot quantify the diversity or degree of rarity of each generated image. In this work, we propose a new evaluation metric, called 'rarity score', to measure the individual rarity of each image synthesized by generative models. We first present the empirical observation that, in terms of nearest-neighbor distances in feature space, common samples are close to each other and rare samples are far from each other. We then use our metric to demonstrate that the extent to which different generative models produce rare images can be effectively compared. We also propose a method to compare rarities between datasets that share the same concept, such as CelebA-HQ and FFHQ. Finally, we analyze the use of the metric with different feature space designs to better understand the relationship between feature spaces and the resulting sparse images. Code will be made publicly available online for the research community.
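The abstract describes the metric only in words, so the following is a minimal sketch of one way a nearest-neighbor rarity score could be computed, not the authors' released code. It assumes that each real sample defines a ball whose radius is the distance to its k-th nearest real neighbor, and that a generated sample is scored by the smallest such radius among the balls that contain it (samples outside every ball get NaN). The function names `knn_radii` and `rarity_scores`, the parameter `k`, and the use of raw Euclidean distance on precomputed feature vectors are all assumptions made for illustration; the exact definition should be checked against the paper.

```python
import numpy as np


def knn_radii(real_feats: np.ndarray, k: int) -> np.ndarray:
    """Distance from each real feature to its k-th nearest real neighbor."""
    # Pairwise Euclidean distances between real features, shape (R, R).
    diffs = real_feats[:, None, :] - real_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting each row, index 0 is the self-distance (0),
    # so index k is the k-th nearest neighbor.
    return np.sort(dists, axis=1)[:, k]


def rarity_scores(real_feats: np.ndarray, gen_feats: np.ndarray, k: int = 3) -> np.ndarray:
    """Assumed rarity: smallest k-NN radius among real balls containing the sample."""
    radii = knn_radii(real_feats, k)                      # shape (R,)
    # Distances from every generated sample to every real sample, shape (G, R).
    d = np.linalg.norm(gen_feats[:, None, :] - real_feats[None, :, :], axis=-1)
    inside = d <= radii[None, :]                          # is sample g inside ball r?
    # Keep radii only for balls that contain the sample; ignore the rest.
    candidates = np.where(inside, radii[None, :], np.inf)
    scores = candidates.min(axis=1)
    # Samples outside every ball have no defined score in this sketch.
    scores[~inside.any(axis=1)] = np.nan
    return scores


if __name__ == "__main__":
    # Toy 1-D "features": three clustered real points and one isolated point.
    real = np.array([[0.0], [1.0], [2.0], [10.0]])
    gen = np.array([[0.5], [7.0], [100.0]])
    print(rarity_scores(real, gen, k=1))
```

In the toy example, the generated point near the dense cluster falls inside small balls and receives a low score, while the point near the isolated real sample is covered only by that sample's large ball and receives a high score, which matches the abstract's observation that rare samples lie far apart in feature space. The pairwise-distance computation is quadratic in memory, so a real evaluation would likely need batching or an approximate nearest-neighbor index.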
