Proper losses for discrete generative models
This work addresses the reliable evaluation of generative models in machine learning, a concern for both researchers and practitioners. Its contribution is incremental in that it extends existing proper-loss theory to a new, black-box setting.
The paper studies how to evaluate discrete generative models with proper losses that treat both the model and the target distribution as black boxes, requiring only i.i.d. samples from each. It shows that such losses must take a polynomial form, and that the number of draws from the model and target distribution must exceed the degree of the polynomial. It also constructs a loss whose expectation is the cross-entropy by extending the construction to other sampling schemes, such as Poisson sampling.
We initiate the study of proper losses for evaluating generative models in the discrete setting. Unlike traditional proper losses, we treat both the generative model and the target distribution as black boxes, assuming only the ability to draw i.i.d. samples. We define a loss to be black-box proper if the generative distribution that minimizes the expected loss is equal to the target distribution. Using techniques from statistical estimation theory, we give a general construction and characterization of black-box proper losses: they must take a polynomial form, and the number of draws from the model and target distribution must exceed the degree of the polynomial. The characterization rules out a loss whose expectation is the cross-entropy between the target distribution and the model. By extending the construction to arbitrary sampling schemes such as Poisson sampling, however, we show that one can construct a loss with this expectation.
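To make the definition of black-box properness concrete, the following is a minimal sketch of one simple loss of this kind. It is a standard collision-based estimator and is not claimed to be the paper's own construction. The names `l2_loss`, `expected_loss`, and `empirical_loss` are hypothetical. The loss uses two draws from the model q and one draw from the target p, and its expectation is the polynomial sum_x q(x)^2 - 2 sum_x p(x) q(x), which equals ||q - p||^2 - ||p||^2 and is therefore minimized exactly when q = p.

```python
import itertools
import random


def l2_loss(x1, x2, y1):
    """Illustrative loss from two model draws (x1, x2) and one target draw (y1).

    Its expectation is sum_x q(x)^2 - 2 * sum_x p(x) q(x).
    """
    return float(x1 == x2) - 2.0 * float(x1 == y1)


def expected_loss(q, p):
    """Exact expectation of l2_loss for x1, x2 ~ q and y1 ~ p, all independent.

    Computed by enumerating the (small) support; used only to check the algebra.
    """
    support = range(len(q))
    total = 0.0
    for x1, x2, y1 in itertools.product(support, repeat=3):
        total += q[x1] * q[x2] * p[y1] * l2_loss(x1, x2, y1)
    return total


def empirical_loss(q, p, n, seed=0):
    """Monte Carlo average of l2_loss using only i.i.d. samples from q and p."""
    rng = random.Random(seed)
    support = list(range(len(q)))
    total = 0.0
    for _ in range(n):
        x1, x2 = rng.choices(support, weights=q, k=2)
        (y1,) = rng.choices(support, weights=p, k=1)
        total += l2_loss(x1, x2, y1)
    return total / n


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]        # assumed target distribution (illustrative)
    q_bad = [0.2, 0.3, 0.5]    # a mismatched model
    print("expected loss at q = p:   ", expected_loss(p, p))
    print("expected loss at q != p:  ", expected_loss(q_bad, p))
    print("sampled estimate at q = p:", empirical_loss(p, p, n=20000))
```

In this sketch the exact expected loss at q = p equals -sum_x p(x)^2, and any other model yields a strictly larger value. The sampled estimate only ever touches draws from the two distributions, never their probabilities, which is the black-box requirement described in the abstract.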