Evaluation Metrics for Conditional Image Generation
This work addresses the need for better evaluation metrics in conditional image generation. The contribution is incremental in that it builds on existing unconditional metrics.
The paper tackles the problem of evaluating class-conditional image generation by proposing two new metrics that generalize the Inception Score and the Fréchet Inception Distance. Theoretical analysis and empirical evaluation show their utility in analyzing model performance.
We present two new metrics for evaluating generative models in the class-conditional image generation setting. These metrics are obtained by generalizing the two most popular unconditional metrics: the Inception Score (IS) and the Fréchet Inception Distance (FID). A theoretical analysis shows the motivation behind each proposed metric and links the novel metrics to their unconditional counterparts. For IS, the link takes the form of a product; for FID, it takes the form of an upper bound. We provide an extensive empirical evaluation, comparing the metrics to their unconditional variants and to other metrics, and use them to analyze existing generative models, thus providing additional insights about their performance, from unlearned classes to mode collapse.
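
For orientation, the two unconditional metrics being generalized can be sketched as follows. This is a minimal illustration of the standard IS and FID definitions only, not of the conditional metrics proposed here. It assumes the Inception network outputs are already available as arrays: `probs` (softmax class probabilities per generated image) and `feats_a`, `feats_b` (pooled feature vectors for the two image sets). The function names and the eigenvalue-based trace computation are illustrative choices, not taken from the paper.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Standard IS: exp of the mean KL divergence between p(y|x) and p(y).

    probs: array of shape (n_images, n_classes), rows sum to 1 (assumed input).
    """
    probs = np.asarray(probs, dtype=np.float64)
    marginal = probs.mean(axis=0, keepdims=True)  # empirical p(y)
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))

def frechet_distance(feats_a, feats_b):
    """Standard FID: Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_images, n_features) (assumed inputs).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^{1/2}) equals the sum of square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

In this sketch, IS ranges from 1 (every image receives the same predicted distribution) up to the number of classes (confident predictions spread evenly over all classes), while FID is 0 when the two feature sets have identical means and covariances.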