Super-Samples from Kernel Herding
Kernel herding provides a more sample-efficient way to approximate distributions, such as Bayesian predictive distributions. The contribution is incremental in that it builds on existing herding techniques.
The paper tackles the problem of approximating probability density functions in continuous spaces by extending the herding algorithm with the kernel trick. The resulting deterministic process, called kernel herding, reduces the error of expectations of functions in the associated Hilbert space at a rate of O(1/T), which is faster than the O(1/√T) rate of i.i.d. random samples.
We extend the herding algorithm to continuous spaces by using the kernel trick. The resulting "kernel herding" algorithm is an infinite memory deterministic process that learns to approximate a PDF with a collection of samples. We show that kernel herding decreases the error of expectations of functions in the Hilbert space at a rate O(1/T), which is much faster than the usual O(1/√T) for iid random samples. We illustrate kernel herding by approximating Bayesian predictive distributions.
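To make the idea concrete, the following is a minimal sketch of a greedy kernel herding loop, not the authors' implementation. It rests on several assumptions that are not in the text above: a Gaussian kernel with a hand-picked bandwidth, a finite pool of candidate points standing in for the target density (so the expectation under p is replaced by an average over the pool), and a maximization restricted to that pool. At each step it selects the candidate that maximizes the estimated mean kernel value under the target minus the average kernel value to the samples chosen so far. The names `gaussian_kernel`, `kernel_herding`, and `mmd` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def kernel_herding(pool, num_samples, bandwidth=1.0):
    """Greedily select num_samples points from pool (assumed ~ target p).

    Assumption: the pool serves both as an empirical stand-in for p and
    as the candidate set for the argmax, so points may be picked twice.
    """
    K = gaussian_kernel(pool, pool, bandwidth)
    # Empirical estimate of E_{x' ~ p}[k(x, x')] at every candidate x.
    mean_embedding = K.mean(axis=1)
    # Running sum of k(x, x_s) over the samples chosen so far.
    running_sum = np.zeros(len(pool))
    chosen = []
    for t in range(num_samples):
        scores = mean_embedding - running_sum / (t + 1)
        idx = int(np.argmax(scores))  # deterministic: no random draws
        chosen.append(idx)
        running_sum += K[:, idx]
    return pool[chosen]


def mmd(samples, pool, bandwidth=1.0):
    """Distance between the kernel mean embeddings of samples and pool."""
    value = (
        gaussian_kernel(samples, samples, bandwidth).mean()
        - 2.0 * gaussian_kernel(samples, pool, bandwidth).mean()
        + gaussian_kernel(pool, pool, bandwidth).mean()
    )
    return float(np.sqrt(max(value, 0.0)))
```

For example, one could draw a pool with `np.random.default_rng(0).normal(size=(500, 1))`, call `kernel_herding(pool, 50)`, and compare `mmd` of the herded points against that of a random subset of the same size. Restricting the search to a finite pool keeps the sketch short; optimizing over the continuous space would require a numerical optimizer and is left out here.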