Multiple-Instance Learning: Radon-Nikodym Approach to Distribution Regression Problem
The paper provides a theoretical and practical solution for distribution regression in fields such as multiple-instance learning, though it appears incremental because it builds on existing mathematical frameworks.
The paper tackles the distribution regression problem, where a bag of observations maps to a single value, by transforming it into a random vector problem using distribution moments and applying Radon-Nikodym or least squares theory to estimate the regression function and obtain the probability distribution of outcomes.
For the distribution regression problem, where a bag of $x$--observations is mapped to a single $y$ value, a one--step solution is proposed. The problem of mapping a random distribution to a random value is transformed into that of mapping a random vector to a random value by taking the distribution moments of the $x$ observations in a bag as the random vector. Radon--Nikodym or least squares theory can then be applied, which gives the $y(x)$ estimator. The probability distribution of $y$ is also obtained, which requires solving a generalized eigenvalue problem: the matrix spectrum (which does not depend on $x$) gives the possible $y$ outcomes, and the $x$--dependent probabilities of these outcomes are obtained by projecting the distribution with a fixed $x$ value (delta--function) onto the corresponding eigenvector. A library providing a numerically stable polynomial basis for these calculations is available, which makes the proposed approach practical.
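The pipeline described above can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract, not the paper's implementation: the matrix definitions (`G`, `yG`, `yQ`), all function names, and the synthetic data are assumptions made here for illustration. It also does not use the library mentioned above; plain monomials stand in for a numerically stable polynomial basis, which is only reasonable at low degree.

```python
import numpy as np
from scipy.linalg import eigh

def bag_moments(bag, degree):
    """Average the monomials x^0..x^degree over the observations of one bag."""
    bag = np.asarray(bag, dtype=float)
    return np.array([np.mean(bag ** k) for k in range(degree + 1)])

def fit(bags, y, degree):
    """Build the moment matrices and solve the generalized eigenvalue problem."""
    M = np.array([bag_moments(b, degree) for b in bags])  # one moment vector per bag
    y = np.asarray(y, dtype=float)
    G = M.T @ M / len(y)                    # assumed Gram matrix of moment vectors
    yG = (M * y[:, None]).T @ M / len(y)    # same sum, each bag weighted by its y
    yQ = M.T @ y / len(y)                   # right-hand side for least squares
    # Solves yG psi = lambda G psi; eigenvectors satisfy psi.T @ G @ psi = I.
    outcomes, psi = eigh(yG, G)
    return G, yQ, outcomes, psi

def least_squares(q, G, yQ):
    """Least squares estimate of y for a moment vector q."""
    return q @ np.linalg.solve(G, yQ)

def outcome_probabilities(q, psi):
    """Squared projections of q onto each eigenvector, normalized to sum to 1."""
    proj = (psi.T @ q) ** 2
    return proj / proj.sum()

# Synthetic data (assumption): each bag is drawn around a hidden mean mu,
# and the bag's y value depends on mu.
rng = np.random.default_rng(0)
mus = rng.uniform(-1.0, 1.0, size=50)
bags = [rng.normal(mu, 0.3, size=30) for mu in mus]
y = mus ** 2 + 0.05 * rng.normal(size=mus.size)

degree = 3
G, yQ, outcomes, psi = fit(bags, y, degree)

# A distribution with a fixed x value (delta-function) is a one-element bag.
q = bag_moments([0.5], degree)
p = outcome_probabilities(q, psi)
y_rn = outcomes @ p              # probability-weighted average of the outcomes
y_ls = least_squares(q, G, yQ)
```

In this sketch the eigenvalues in `outcomes` do not depend on `q`, while the probabilities `p` do, which mirrors the split described in the abstract. Because each eigenvalue is a weighted average of the observed `y` values with non-negative weights, both the outcomes and `y_rn` stay within the observed range of `y`; the least squares estimate `y_ls` carries no such guarantee.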