Learning from Distributions via Support Measure Machines
The paper provides a method for machine learning tasks in which data are represented as distributions, such as domain adaptation or structured data analysis. The contribution is incremental, however, because it builds on existing kernel-based techniques.
The paper tackles discriminative learning on probability distributions rather than vectorial data. It introduces support measure machines (SMMs), which generalize SVMs to operate on mean embeddings in an RKHS, and demonstrates their effectiveness on synthetic and real-world data.
This paper presents a kernel-based discriminative learning framework on probability measures. Rather than relying on large collections of vectorial training examples, our framework learns using a collection of probability distributions that have been constructed to meaningfully represent training data. By representing these probability distributions as mean embeddings in a reproducing kernel Hilbert space (RKHS), we are able to apply many standard kernel-based learning techniques in a straightforward fashion. To accomplish this, we construct a generalization of the support vector machine (SVM) called a support measure machine (SMM). Our analyses of SMMs provide several insights into their relationship to traditional SVMs. Based on these insights, we propose a flexible SVM (Flex-SVM) that places different kernel functions on each training example. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our proposed framework.
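To make the mean-embedding idea concrete, the sketch below estimates the inner product of two mean embeddings from samples. For a kernel k, that inner product equals the expected value of k(x, y) with x drawn from P and y drawn from Q, so an empirical estimate is the average of all pairwise kernel values. This is an illustration only, not the authors' implementation: the Gaussian RBF kernel, the bandwidth `gamma`, the helper names `rbf_gram`, `mean_embedding_kernel`, and `distribution_gram`, and the toy Gaussian data are all assumptions introduced here.

```python
import numpy as np


def rbf_gram(X, Y, gamma=1.0):
    """Pairwise Gaussian RBF values k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mean_embedding_kernel(X, Y, gamma=1.0):
    """Empirical estimate of <mu_P, mu_Q>: the mean of k(x_i, y_j) over all pairs."""
    return float(rbf_gram(X, Y, gamma).mean())


def distribution_gram(samples, gamma=1.0):
    """Gram matrix over a list of sample sets, one set per distribution."""
    n = len(samples)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = mean_embedding_kernel(samples[i], samples[j], gamma)
    return K


# Toy data (assumed): three distributions, each observed through 50 points in 2-D.
# The first two share a mean; the third is shifted.
rng = np.random.default_rng(0)
samples = [rng.normal(loc=m, scale=1.0, size=(50, 2)) for m in (0.0, 0.0, 3.0)]
K = distribution_gram(samples, gamma=0.5)
```

Because `K` is an ordinary positive semidefinite Gram matrix, it could in principle be handed to any kernel SVM solver that accepts a precomputed kernel, which is one way to read the claim that standard kernel-based techniques carry over to distributions.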