MLAP Jul 3, 2017

People Mover's Distance: Class level geometry using fast pairwise data adaptive transportation costs

arXiv:1707.00514v1
7 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses similarity analysis for large-scale, high-dimensional survey data, such as comparing U.S. counties. It is incremental in that it builds on existing earth mover's distance methods.

The paper tackles the problem of defining a network graph on large, non-i.i.d. class collections by developing an approximate earth mover's distance algorithm using data-adaptive transportation costs, applied to a U.S. survey to measure county similarities.

We address the problem of defining a network graph on a large collection of classes. Each class comprises a collection of data points, sampled in a non-i.i.d. way from some unknown underlying distribution. The application we consider in this paper is a large-scale, high-dimensional survey of people living in the US, and the question of how similar or different the various counties in which these people live are. We use a co-clustering diffusion metric to learn the underlying distribution of people, and build an approximate earth mover's distance algorithm using this data-adaptive transportation cost.
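
To make the role of the transportation cost concrete, the following is a minimal sketch of an earth mover's distance between two classes, written as a small linear program. It is not the paper's approximate algorithm: it solves the exact transport problem, and everything in it is an assumption for illustration. Each county is represented by a hypothetical histogram over a few clusters of people, and the `cost` matrix is a placeholder for whatever learned metric (here, the diffusion metric) supplies the cost of moving mass between clusters.

```python
import numpy as np
from scipy.optimize import linprog


def emd(weights_a, weights_b, cost):
    """Exact earth mover's distance between two discrete distributions.

    weights_a, weights_b: nonnegative masses with equal totals.
    cost[i, j]: cost of moving one unit of mass from bin i of A to bin j of B.
    In the setting above this would come from the learned metric; here it is
    a hypothetical placeholder supplied by the caller.
    """
    a = np.asarray(weights_a, dtype=float)
    b = np.asarray(weights_b, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n, m = cost.shape

    # The unknown is the flow matrix, flattened row by row.
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0  # mass leaving bin i of A equals a[i]
    for j in range(m):
        a_eq[n + j, j::m] = 1.0           # mass arriving at bin j of B equals b[j]
    b_eq = np.concatenate([a, b])

    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun


# Hypothetical example: three clusters on a line, cost = distance between them.
cost = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)
county_a = [1.0, 0.0, 0.0]
county_b = [0.0, 0.0, 1.0]
county_c = [0.5, 0.5, 0.0]
d_ab = emd(county_a, county_b, cost)
d_ac = emd(county_a, county_c, cost)
```

In this toy setup county A is closer to county C than to county B, because less mass has to travel a shorter distance. Pairwise distances of this kind are what would define edge weights in a graph over classes; an exact linear program scales poorly with the number of bins and classes, which is one reason to look for an approximation.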

