LG AI ML | Jul 2, 2025

Enhanced Generative Model Evaluation with Clipped Density and Coverage

Nicolas Salvy, Hugues Talbot, Bertrand Thirion

arXiv:2507.01761v2 | 9.42 citations | h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses the challenge of evaluating generative model quality for critical applications. It is an incremental advance because it builds on existing fidelity and coverage concepts.

The paper tackles the problem of unreliable evaluation metrics for generative models by introducing Clipped Density and Clipped Coverage, which outperform existing methods in robustness, sensitivity, and interpretability.

Although generative models have made remarkable progress in recent years, their use in critical applications has been hindered by an inability to reliably evaluate the quality of their generated samples. Quality refers to at least two complementary concepts: fidelity and coverage. Current quality metrics often lack reliable, interpretable values due to an absence of calibration or insufficient robustness to outliers. To address these shortcomings, we introduce two novel metrics: Clipped Density and Clipped Coverage. By clipping individual sample contributions, as well as the radii of nearest neighbor balls for fidelity, our metrics prevent out-of-distribution samples from biasing the aggregated values. Through analytical and empirical calibration, these metrics demonstrate linear score degradation as the proportion of bad samples increases. Thus, they can be straightforwardly interpreted as equivalent proportions of good samples. Extensive experiments on synthetic and real-world datasets demonstrate that Clipped Density and Clipped Coverage outperform existing methods in terms of robustness, sensitivity, and interpretability when evaluating generative models.
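The abstract's description of the fidelity side (clip the nearest-neighbor ball radii, then clip each sample's contribution) can be illustrated with a rough sketch. This is not the paper's definition: the function names, the choice of the median as the radius cap, the cap of 1 on each per-sample contribution, and the default `k=5` are all assumptions made for illustration, and the calibration step mentioned in the abstract is omitted.

```python
import numpy as np


def pairwise_dist(a, b):
    # Euclidean distance between every row of `a` and every row of `b`.
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def knn_radii(real, k):
    # Distance from each real point to its k-th nearest real neighbor.
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the k-th neighbor (requires k < len(real)).
    d = pairwise_dist(real, real)
    return np.sort(d, axis=1)[:, k]


def clipped_density_sketch(real, fake, k=5):
    # Hypothetical, uncalibrated illustration of the clipping idea.
    radii = knn_radii(real, k)
    # Assumption: cap radii at the median so that outlying real points
    # do not get very large balls.
    radii = np.minimum(radii, np.median(radii))
    # inside[j, i] is True when fake sample j lies in the ball of real sample i.
    inside = pairwise_dist(fake, real) <= radii[None, :]
    # Assumption: each fake sample contributes (count / k), capped at 1,
    # so no single sample can dominate the average.
    contrib = np.minimum(inside.sum(axis=1) / k, 1.0)
    return contrib.mean()
```

Because every per-sample contribution lies in [0, 1] and a sample far from all real balls contributes exactly 0, appending such samples to a generated set lowers this score in proportion to their share of the set. That is the flavor of the linear degradation the abstract describes, although the published metrics reach an interpretable scale through calibration that this sketch does not attempt.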
