LG ML · Apr 20, 2013

Inverse Density as an Inverse Problem: The Fredholm Equation Approach

arXiv:1304.5575v2 · 21 citations

Originality: Incremental advance

AI Analysis

This work addresses a fundamental problem in statistical inference and machine learning, density ratio estimation, which arises in settings such as covariate shift in transfer learning. It is incremental in that it builds on existing kernel and regularization techniques.

The paper tackles the problem of estimating the density ratio q/p, which is crucial for tasks like importance sampling and covariate shift, by reformulating it as a Fredholm integral equation and using kernel methods with regularization. The result is a flexible algorithm (FIRE) with theoretical convergence rates and experimental applications in classification and semi-supervised learning.
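The reformulation can be made concrete with a one-line identity. As a sketch, assume $k(x,y)$ is a kernel, $p > 0$ wherever $q$ is nonzero, and write $\mathcal{K}_p$ for the integral operator weighted by $p$ (this notation is an assumption made here for illustration, not quoted from the paper):

$$
(\mathcal{K}_p f)(x) = \int k(x,y)\, f(y)\, p(y)\, dy,
\qquad
\Big(\mathcal{K}_p \tfrac{q}{p}\Big)(x) = \int k(x,y)\, \frac{q(y)}{p(y)}\, p(y)\, dy = \int k(x,y)\, q(y)\, dy = (\mathcal{K}_q 1)(x).
$$

So $f = \frac{q}{p}$ solves $\mathcal{K}_p f = \mathcal{K}_q 1$, an integral equation of the first kind in which both sides can be approximated by averages over samples from $p$ and $q$. Inverting a smoothing operator of this type is ill-posed, which is why regularization enters.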

In this paper we address the problem of estimating the ratio $\frac{q}{p}$, where $p$ is a density function and $q$ is another density or, more generally, an arbitrary function. Knowing or approximating this ratio is needed in various problems of inference and integration, in particular when one needs to average a function with respect to one probability distribution given a sample from another. This is often referred to as {\it importance sampling} in statistical inference and is also closely related to the problem of {\it covariate shift} in transfer learning, as well as to various MCMC methods. It may also be useful for separating the underlying geometry of a space, say a manifold, from the density function defined on it. Our approach is based on reformulating the problem of estimating $\frac{q}{p}$ as an inverse problem in terms of an integral operator corresponding to a kernel, thus reducing it to an integral equation known as the Fredholm problem of the first kind. This formulation, combined with the techniques of regularization and kernel methods, leads to a principled kernel-based framework for constructing algorithms and for analyzing them theoretically. The resulting family of algorithms (FIRE, for Fredholm Inverse Regularized Estimator) is flexible, simple and easy to implement. We provide a detailed theoretical analysis, including concentration bounds and convergence rates for the Gaussian kernel in the case of densities defined on $\mathbb{R}^d$, compact domains in $\mathbb{R}^d$ and smooth $d$-dimensional sub-manifolds of Euclidean space. We also show experimental results, including applications to classification and semi-supervised learning within the covariate shift framework, and demonstrate some encouraging experimental comparisons. Finally, we show how the parameters of our algorithms can be chosen in a completely unsupervised manner.
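To illustrate how the Fredholm formulation turns into a computation, here is a minimal sketch of a regularized kernel estimator of $\frac{q}{p}$ at the sample points drawn from $p$. It is not the authors' reference implementation. It assumes a Gaussian kernel for both the integral operator and the RKHS, a squared-error fit of the empirical equation on the $p$-sample, and a closed-form solution derived from the normal equations of that objective. The names `gaussian_gram`, `fredholm_ratio_sketch`, `op_width`, `rkhs_width` and `lam` are hypothetical.

```python
import numpy as np


def gaussian_gram(a, b, width):
    """Unnormalized Gaussian kernel matrix exp(-||a_i - b_j||^2 / (2 * width^2))."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * width**2))


def fredholm_ratio_sketch(x_p, x_q, op_width=0.5, rkhs_width=0.5, lam=1e-3):
    """Estimate q/p at the rows of x_p (illustrative sketch, assumed objective).

    Minimizes (1/n) * ||A K_h alpha - b||^2 + lam * alpha^T K_h alpha, where
    A approximates the p-weighted integral operator, b approximates the
    operator for q applied to the constant function 1, and the estimate is
    f = K_h alpha (a kernel expansion over the p-sample).
    """
    n, m = len(x_p), len(x_q)
    A = gaussian_gram(x_p, x_p, op_width) / n              # empirical K_p
    b = gaussian_gram(x_p, x_q, op_width).sum(axis=1) / m  # empirical K_q 1
    K_h = gaussian_gram(x_p, x_p, rkhs_width)              # RKHS Gram matrix
    # A sufficient stationarity condition: (A^T A K_h + n*lam*I) alpha = A^T b.
    alpha = np.linalg.solve(A.T @ A @ K_h + n * lam * np.eye(n), A.T @ b)
    return K_h @ alpha
```

Calling `fredholm_ratio_sketch(x_p, x_q)` with two arrays of shape `(n, d)` and `(m, d)` returns `n` estimated ratio values, which could serve as importance weights in a covariate shift setting. The two kernel widths and `lam` are left as plain arguments here; the abstract states that the paper gives an unsupervised way to choose such parameters, which this sketch does not attempt to reproduce.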
