ML LG

Jun 8, 2021

Intrinsic Dimension Estimation Using Wasserstein Distances

Adam Block, Zeyu Jia, Yury Polyanskiy, Alexander Rakhlin

arXiv:2106.04018v2

16.0

28 citations

Originality: Incremental advance

AI Analysis

This work addresses the low-dimensional structure of high-dimensional data, a question relevant to machine learning practitioners, and offers incremental improvements in intrinsic dimension estimation and in the sample complexity analysis of GANs.

The paper tackles the problem of estimating the intrinsic dimension of high-dimensional data from a finite sample. It introduces a new estimator with finite-sample, non-asymptotic guarantees and applies the same techniques to derive sample complexity bounds for GANs that depend only on the intrinsic dimension of the data.

It has long been thought that high-dimensional data encountered in many practical machine learning tasks have low-dimensional structure, i.e., the manifold hypothesis holds. A natural question, thus, is to estimate the intrinsic dimension of a given population distribution from a finite sample. We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees. We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending only on the intrinsic dimension of the data.
