ML CO ME Nov 30, 2016

Likelihood-free inference by ratio estimation

Owen Thomas, Ritabrata Dutta, Jukka Corander, Samuel Kaski, Michael U. Gutmann

arXiv:1611.10242v6 26.0 178 citations

Originality: Incremental advance

AI Analysis

The method targets likelihood-free inference that is easier to use and rests on fewer distributional assumptions, which is relevant to computational statistics and machine learning. It offers an incremental improvement over existing likelihood-free approaches.

The authors address parametric statistical inference when likelihood computations are expensive but sampling from the model is possible. They develop a likelihood-free method that estimates the posterior by ratio estimation using logistic regression, which enables automatic selection of relevant summary statistics. They demonstrate the method on challenging stochastic nonlinear dynamical systems and high-dimensional summary statistics.

We consider the problem of parametric statistical inference when likelihood computations are prohibitively expensive but sampling from the model is possible. Several so-called likelihood-free methods have been developed to perform inference in the absence of a likelihood function. The popular synthetic likelihood approach infers the parameters by modelling summary statistics of the data by a Gaussian probability distribution. In another popular approach called approximate Bayesian computation, the inference is performed by identifying parameter values for which the summary statistics of the simulated data are close to those of the observed data. Synthetic likelihood is easier to use as no measure of 'closeness' is required, but the Gaussianity assumption is often limiting. Moreover, both approaches require judiciously chosen summary statistics. We here present an alternative inference approach that is as easy to use as synthetic likelihood but not as restricted in its assumptions, and that, in a natural way, enables automatic selection of relevant summary statistics from a large set of candidates. The basic idea is to frame the problem of estimating the posterior as a problem of estimating the ratio between the data-generating distribution and the marginal distribution. This problem can be solved by logistic regression, and including regularising penalty terms enables automatic selection of the summary statistics relevant to the inference task. We illustrate the general theory on canonical examples and employ it to perform inference for challenging stochastic nonlinear dynamical systems and high-dimensional summary statistics.
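The ratio-estimation idea in the abstract can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the model (theta ~ N(0, 1), x | theta ~ N(theta, 1)), the candidate summaries (x and x^2), the function names, and the proximal gradient solver are all choices made here for the example. It trains a logistic regression to separate samples from p(x | theta) from samples of the marginal p(x). With equal class sizes the fitted logit h(x) approximates log p(x | theta) - log p(x), so log prior(theta) + h(x_obs) gives the log posterior up to a constant. An L1 penalty stands in for the "regularising penalty terms".

```python
import math
import random


def summaries(x):
    """Candidate summary statistics psi(x); this small set is an illustrative choice."""
    return [x, x * x]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_log_ratio(x_theta, x_marg, l1=0.005, steps=200, lr=1.0):
    """L1-penalised logistic regression fitted by proximal gradient descent.

    Separates x ~ p(x | theta) (label 1) from x ~ p(x) (label 0). Assumes both
    samples have the same size, so the fitted logit h(x) estimates
    log p(x | theta) - log p(x). Returns h as a function.
    """
    raw = [summaries(x) for x in x_theta + x_marg]
    labels = [1.0] * len(x_theta) + [0.0] * len(x_marg)
    n, d = len(raw), len(raw[0])
    # Standardise each summary so that a fixed step size is stable.
    mean = [sum(r[j] for r in raw) / n for j in range(d)]
    std = [math.sqrt(sum((r[j] - mean[j]) ** 2 for r in raw) / n) for j in range(d)]
    rows = [[(r[j] - mean[j]) / std[j] for j in range(d)] for r in raw]
    b0, beta = 0.0, [0.0] * d
    for _ in range(steps):
        g0, grad = 0.0, [0.0] * d
        for f, y in zip(rows, labels):
            err = sigmoid(b0 + sum(b * v for b, v in zip(beta, f))) - y
            g0 += err / n
            for j in range(d):
                grad[j] += err * f[j] / n
        b0 -= lr * g0  # the intercept is not penalised
        for j in range(d):
            z = beta[j] - lr * grad[j]
            # Soft-thresholding: the proximal step for the L1 penalty.
            beta[j] = math.copysign(max(abs(z) - lr * l1, 0.0), z)

    def h(x):
        f = [(s - m) / sd for s, m, sd in zip(summaries(x), mean, std)]
        return b0 + sum(b * v for b, v in zip(beta, f))

    return h


def log_posterior_on_grid(x_obs, thetas, n=200, seed=0):
    """Unnormalised log posterior on a grid for the assumed toy model."""
    rng = random.Random(seed)
    # Marginal samples: draw theta from the prior, then x from the model.
    x_marg = [rng.gauss(0.0, 1.0) + rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]  # reused for every theta
    out = []
    for theta in thetas:
        x_theta = [theta + e for e in noise]
        h = fit_log_ratio(x_theta, x_marg)
        log_prior = -0.5 * theta ** 2  # N(0, 1) prior, constant dropped
        out.append(log_prior + h(x_obs))
    return out


if __name__ == "__main__":
    grid = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
    for theta, lp in zip(grid, log_posterior_on_grid(1.0, grid)):
        print(f"theta={theta:+.1f}  log posterior (unnormalised)={lp:+.3f}")
```

For this conjugate toy model the exact posterior is N(x_obs / 2, 1/2), so the grid values should peak near theta = x_obs / 2, which gives a simple sanity check on the estimated ratio. Increasing `l1` shrinks more summary coefficients to exactly zero, which is the mechanism behind the automatic selection the abstract describes.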
