LG CV Sep 4, 2023

Probabilistic Precision and Recall Towards Reliable Evaluation of Generative Models

arXiv:2309.01590v1 12.313 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more robust evaluation metrics in generative modeling, which is crucial for researchers and practitioners in machine learning, though it is incremental as it builds on prior kNN-based approaches.

The paper tackles the problem of unreliable evaluation of generative models' fidelity and diversity by identifying issues with existing kNN-based precision-recall metrics, such as susceptibility to outliers. It proposes new probabilistic metrics, P-precision and P-recall, which the experiments show to provide more reliable estimates than existing methods.

Assessing the fidelity and diversity of generative models is a difficult but important problem for technological advancement. Recent papers have therefore introduced k-Nearest Neighbor (kNN) based precision-recall metrics to break down the statistical distance into fidelity and diversity. While these metrics provide an intuitive method, we thoroughly analyze them and identify oversimplified assumptions and undesirable properties of kNN that result in unreliable evaluation, such as susceptibility to outliers and insensitivity to distributional changes. We therefore propose novel metrics, P-precision and P-recall (PP&PR), based on a probabilistic approach that addresses these problems. Through extensive investigations on toy experiments and state-of-the-art generative models, we show that PP&PR provide more reliable estimates for comparing fidelity and diversity than the existing metrics. The code is available at https://github.com/kdst-team/Probablistic_precision_recall.
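
To make the outlier issue mentioned in the abstract concrete, here is a minimal sketch of a kNN-based precision-recall estimate of the kind the paper critiques. It is not the authors' code and does not implement P-precision or P-recall. It assumes the common formulation in which each sample's ball has a radius equal to the distance to its k-th nearest neighbour in its own set, and a query counts as covered if it falls inside any ball. The function names, the toy 2-D data, and the choice k=3 are illustrative assumptions.

```python
import math

def knn_radii(points, k):
    """Distance from each point to its k-th nearest neighbour in the same set."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii

def coverage(queries, support, k):
    """Fraction of queries lying inside at least one kNN ball of the support set."""
    radii = knn_radii(support, k)
    hits = sum(
        1 for q in queries
        if any(math.dist(q, s) <= r for s, r in zip(support, radii))
    )
    return hits / len(queries)

def knn_precision_recall(real, fake, k=3):
    """Assumed kNN formulation: precision covers fakes with real balls, recall the reverse."""
    precision = coverage(fake, real, k)
    recall = coverage(real, fake, k)
    return precision, recall

# Toy data (hypothetical): real samples cluster near the origin,
# fake samples cluster far away around (5, 5).
real = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
fake = [(5 + x * 0.1, 5 + y * 0.1) for x in range(5) for y in range(5)]

p_clean, r_clean = knn_precision_recall(real, fake)

# A single real outlier near the fake cluster gets a very large kNN radius,
# because its k-th nearest real neighbour is far away.
real_with_outlier = real + [(5.2, 5.2)]
p_outlier, r_outlier = knn_precision_recall(real_with_outlier, fake)

print(p_clean, p_outlier)  # 0.0 1.0
```

In this toy setup one outlier among 26 real samples moves the precision estimate from 0.0 to 1.0, because its oversized ball covers every fake sample. This is the kind of sensitivity to outliers the abstract describes.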
