LG PR MLJun 2, 2020

A Randomized Algorithm to Reduce the Support of Discrete Measures

Francesco Cosentino, Harald Oberhauser, Alessandro Abate

arXiv:2006.01757v27.213 citationsHas Code

Originality Incremental advance

AI Analysis

This work addresses a complexity reduction problem in computational mathematics and machine learning, with potential applications in data compression and optimization, though it appears incremental as it builds on existing geometric characterizations.

The paper tackles the problem of reducing the support size of discrete probability measures from N to n+1 atoms while preserving the mean for a set of n functions, and presents a randomized algorithm that achieves this reduction, showing benefits in synthetic and real-world data when N is much larger than n.

Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms and has the same mean when integrated against each of the $n$ functions. If $ N \gg n$ this results in a huge reduction of complexity. We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling". We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N\gg n$ regime. A Python implementation is available at \url{https://github.com/FraCose/Recombination_Random_Algos}.

View on arXiv PDF Code

Similar