LGPRMLJun 2, 2020

A Randomized Algorithm to Reduce the Support of Discrete Measures

arXiv:2006.01757v213 citationsHas Code
Originality Incremental advance
AI Analysis

This work addresses a complexity reduction problem in computational mathematics and machine learning, with potential applications in data compression and optimization, though it appears incremental as it builds on existing geometric characterizations.

The paper tackles the problem of reducing the support size of discrete probability measures from N to n+1 atoms while preserving the mean for a set of n functions, and presents a randomized algorithm that achieves this reduction, showing benefits in synthetic and real-world data when N is much larger than n.

Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms and has the same mean when integrated against each of the $n$ functions. If $ N \gg n$ this results in a huge reduction of complexity. We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling". We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N\gg n$ regime. A Python implementation is available at \url{https://github.com/FraCose/Recombination_Random_Algos}.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes