Likelihood-free inference via classification
This work addresses a central difficulty in likelihood-free inference for intractable generative models, namely how to measure the discrepancy between simulated and observed data, and resolves it by using classification accuracy as the discrepancy measure.
The paper tackles statistical inference for complex generative models whose likelihood is computationally prohibitive to evaluate. It proposes a likelihood-free method that recasts inference as classifying data into simulated versus observed, and validates it with theory, simulations, and real data from an individual-based epidemiological model of bacterial infections in child care centers.
Increasingly complex generative models are being used across disciplines as they allow for realistic characterization of data, but a common difficulty with them is the prohibitively large computational cost of evaluating the likelihood function and thus of performing likelihood-based statistical inference. A likelihood-free inference framework has emerged where the parameters are identified by finding values that yield simulated data resembling the observed data. While widely applicable, a major difficulty in this framework is how to measure the discrepancy between the simulated and observed data. Transforming the original problem into a problem of classifying the data into simulated versus observed, we find that classification accuracy can be used to assess the discrepancy. The complete arsenal of classification methods thereby becomes available for inference of intractable generative models. We validate our approach using theory and simulations for both point estimation and Bayesian inference, and demonstrate its use on real data by inferring an individual-based epidemiological model for bacterial infections in child care centers.
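To make the central idea concrete, the following is a minimal sketch of classification accuracy used as a discrepancy measure; it is not the paper's implementation. Everything specific in it is an assumption chosen for illustration: the toy simulator `simulate` (a unit-variance Gaussian with unknown mean), the nearest-centroid classifier, the 5-fold cross-validation, and the grid search in `estimate_theta`. Accuracy near 0.5 means the classifier cannot tell simulated from observed data, so the candidate parameter is a good match. Accuracy near 1 means the two data sets are easy to tell apart.

```python
import numpy as np


def simulate(theta, n, rng):
    """Hypothetical toy simulator (assumption): n draws from N(theta, 1)."""
    return rng.normal(loc=theta, scale=1.0, size=(n, 1))


def classification_discrepancy(x_obs, x_sim, n_folds=5, rng=None):
    """Cross-validated accuracy of a nearest-centroid classifier that
    tries to separate observed data (label 0) from simulated data (label 1).
    Values near 0.5 indicate the two data sets are hard to distinguish."""
    rng = np.random.default_rng(0) if rng is None else rng
    data = np.vstack([x_obs, x_sim])
    labels = np.concatenate([
        np.zeros(len(x_obs), dtype=int),
        np.ones(len(x_sim), dtype=int),
    ])
    # Shuffle once, then split the indices into roughly equal folds.
    folds = np.array_split(rng.permutation(len(data)), n_folds)
    correct = 0
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        x_train, y_train = data[train_idx], labels[train_idx]
        # Fit: one centroid per class, computed on the training folds only.
        mu0 = x_train[y_train == 0].mean(axis=0)
        mu1 = x_train[y_train == 1].mean(axis=0)
        # Predict: assign each held-out point to the closer centroid.
        d0 = np.sum((data[test_idx] - mu0) ** 2, axis=1)
        d1 = np.sum((data[test_idx] - mu1) ** 2, axis=1)
        pred = (d1 < d0).astype(int)
        correct += int(np.sum(pred == labels[test_idx]))
    return correct / len(data)


def estimate_theta(x_obs, grid, seed=0):
    """Point estimate by grid search (assumption): pick the candidate whose
    simulated data is hardest to distinguish from the observed data."""
    rng = np.random.default_rng(seed)
    scores = np.array([
        classification_discrepancy(x_obs, simulate(t, len(x_obs), rng), rng=rng)
        for t in grid
    ])
    return grid[int(np.argmin(scores))], scores


# Usage: observed data generated with true mean 1.0, candidates from -2 to 4.
rng = np.random.default_rng(42)
x_obs = simulate(1.0, 500, rng)
grid = np.linspace(-2.0, 4.0, 13)
theta_hat, scores = estimate_theta(x_obs, grid)
print("estimate:", theta_hat)
print("accuracy per candidate:", scores.round(2))
```

The nearest-centroid rule only detects differences in the mean, which is enough for this toy model. A more flexible classifier would be needed when simulated and observed data differ in other ways, which is where the wider range of classification methods mentioned above comes in.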