ML LG Jun 8, 2021

Intrinsic Dimension Estimation Using Wasserstein Distances

arXiv:2106.04018v2, 28 citations
Originality: Incremental advance
AI Analysis

This work addresses a problem relevant to machine learning practitioners, namely identifying low-dimensional structure in data, and offers incremental improvements in intrinsic dimension estimation and in the analysis of GANs.

The paper tackles the problem of estimating the intrinsic dimension of high-dimensional data from finite samples, introducing a new estimator with finite sample guarantees and applying it to derive sample complexity bounds for GANs that depend only on the intrinsic dimension.

It has long been thought that high-dimensional data encountered in many practical machine learning tasks have low-dimensional structure, i.e., the manifold hypothesis holds. A natural question, thus, is to estimate the intrinsic dimension of a given population distribution from a finite sample. We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees. We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending only on the intrinsic dimension of the data.
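To make the idea of a Wasserstein-based dimension estimate concrete, here is a minimal sketch. It is not taken from the abstract above and should not be read as the paper's exact estimator. It relies on the standard fact that the 1-Wasserstein distance between two independent empirical samples of size n from a d-dimensional distribution typically decays like n^(-1/d) for d > 2. Comparing that distance at sample sizes n and alpha*n then gives d ≈ log(alpha) / log(W1_n / W1_{alpha n}). The function names `empirical_w1` and `estimate_intrinsic_dim` and the parameters `n`, `alpha` and `seed` are hypothetical, chosen for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def empirical_w1(x, y):
    """Exact W1 between two equal-size point clouds with uniform weights.

    For equal-size uniform empirical measures, the optimal transport plan
    can be taken to be a one-to-one matching, so W1 is the mean cost of
    the minimum-cost assignment.
    """
    if x.shape[0] != y.shape[0]:
        raise ValueError("both samples must have the same number of points")
    cost = cdist(x, y)  # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()


def estimate_intrinsic_dim(data, n, alpha=4, seed=0):
    """Illustrative estimate of intrinsic dimension from W1 decay rates.

    Assumption: W1 between two independent samples of size m scales
    like m ** (-1 / d), so the ratio at sizes n and alpha * n is
    roughly alpha ** (1 / d).
    """
    big = alpha * n
    if data.shape[0] < 2 * big:
        raise ValueError("need at least 2 * alpha * n data points")
    rng = np.random.default_rng(seed)
    idx = rng.permutation(data.shape[0])[: 2 * big]
    a, b = data[idx[:big]], data[idx[big:]]  # two disjoint halves
    w_small = empirical_w1(a[:n], b[:n])     # distance at sample size n
    w_large = empirical_w1(a, b)             # distance at sample size alpha * n
    return np.log(alpha) / np.log(w_small / w_large)
```

For example, points drawn uniformly from a 3-dimensional cube and linearly embedded in 10 dimensions should give an estimate near 3 rather than 10, though the value is noisy at small sample sizes. If the two distances happen to be nearly equal, the ratio is close to 1 and the estimate becomes unstable, so the sketch is best used with `n` in the hundreds or more.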

