PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
PQMass gives researchers and practitioners a statistically rigorous, likelihood-free tool for evaluating generative models, although the contribution is incremental because it builds on existing distribution-comparison techniques.
The authors address the problem of assessing generative model quality without likelihood assumptions. They propose PQMass, a method that divides the sample space into regions and uses chi-squared tests to compare distributions, and they demonstrate its effectiveness on data of various modalities and dimensions.
We propose a likelihood-free method for comparing two distributions given samples from each, with the goal of assessing the quality of generative models. The proposed approach, PQMass, provides a statistically rigorous method for assessing the performance of a single generative model or the comparison of multiple competing models. PQMass divides the sample space into non-overlapping regions and applies chi-squared tests to the number of data samples that fall within each region, giving a p-value that measures the probability that the bin counts derived from two sets of samples are drawn from the same multinomial distribution. PQMass does not depend on assumptions regarding the density of the true distribution, nor does it rely on training or fitting any auxiliary models. We evaluate PQMass on data of various modalities and dimensions, demonstrating its effectiveness in assessing the quality, novelty, and diversity of generated samples. We further show that PQMass scales well to moderately high-dimensional data and thus obviates the need for feature extraction in practical applications.
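The region-counting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the regions are built, so the sketch assumes Voronoi cells around reference points drawn at random from the pooled samples. It then applies a standard chi-squared test of homogeneity to the two vectors of bin counts. The names `region_counts` and `two_sample_chi2` and the parameter `n_regions` are hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def region_counts(samples, centers):
    """Count how many samples fall in each Voronoi cell around `centers`."""
    # Squared Euclidean distance from every sample to every center.
    d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=1)  # index of the closest center per sample
    return np.bincount(nearest, minlength=len(centers))


def two_sample_chi2(x, y, n_regions=20, seed=None):
    """Chi-squared test that x and y share the same region probabilities.

    Assumption (not from the source): regions are Voronoi cells whose
    centers are random points from the pooled samples.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    idx = rng.choice(len(pooled), size=n_regions, replace=False)
    centers = pooled[idx]

    kx = region_counts(x, centers)
    ky = region_counts(y, centers)

    # Discard regions that received no samples from either set.
    total = kx + ky
    keep = total > 0
    kx, ky, total = kx[keep], ky[keep], total[keep]

    # Expected counts if both sets follow one multinomial distribution.
    nx, ny = kx.sum(), ky.sum()
    ex = nx * total / (nx + ny)
    ey = ny * total / (nx + ny)

    stat = ((kx - ex) ** 2 / ex).sum() + ((ky - ey) ** 2 / ey).sum()
    dof = len(total) - 1  # (2 - 1) * (regions - 1) for a 2 x R table
    return stat, chi2.sf(stat, dof)


# Usage: a matching pair of sample sets versus a shifted pair.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 2))
good = rng.normal(size=(500, 2))           # same distribution as `real`
bad = rng.normal(loc=3.0, size=(500, 2))   # shifted distribution

stat_good, p_good = two_sample_chi2(real, good, seed=1)
stat_bad, p_bad = two_sample_chi2(real, bad, seed=1)
print(f"same distribution:    chi2={stat_good:.1f}, p={p_good:.3g}")
print(f"shifted distribution: chi2={stat_bad:.1f}, p={p_bad:.3g}")
```

A small p-value is evidence against the hypothesis that both sample sets induce the same bin probabilities. Because the result depends on the random choice of regions, it may be reasonable in practice to repeat the test over several region draws and inspect the spread of the statistic.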