LG IT ST

May 26, 2016

Learning Multivariate Log-concave Distributions

Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

arXiv:1605.08188v29.732 citations

Originality: Highly original

AI Analysis

This work addresses a fundamental problem in statistical learning theory: how many samples are needed to learn a log-concave density on d-dimensional space. It gives an upper bound that comes close to the known lower bound; no upper bound was previously known for d > 3.

The paper studies the estimation of multivariate log-concave probability density functions and proves the first sample complexity upper bound that holds for all dimensions d ≥ 1. Its estimator draws Õ_d((1/ε)^((d+5)/2)) samples and, with high probability, outputs a hypothesis within total variation distance ε of the target. This comes close to the known lower bound of Ω_d((1/ε)^((d+1)/2)).

We study the problem of estimating multivariate log-concave probability density functions. We prove the first sample complexity upper bound for learning log-concave densities on $\mathbb{R}^d$, for all $d \geq 1$. Prior to our work, no upper bound on the sample complexity of this learning problem was known for the case of $d>3$. In more detail, we give an estimator that, for any $d \geq 1$ and $\epsilon>0$, draws $\tilde{O}_d \left( (1/\epsilon)^{(d+5)/2} \right)$ samples from an unknown target log-concave density on $\mathbb{R}^d$, and outputs a hypothesis that (with high probability) is $\epsilon$-close to the target, in total variation distance. Our upper bound on the sample complexity comes close to the known lower bound of $\Omega_d \left( (1/\epsilon)^{(d+1)/2} \right)$ for this problem.
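To make the gap between the two bounds concrete, the following minimal Python sketch evaluates only the polynomial terms $(1/\epsilon)^{(d+5)/2}$ and $(1/\epsilon)^{(d+1)/2}$ stated in the abstract. It is an illustration under stated assumptions, not the paper's estimator: the function names are hypothetical, and the dimension-dependent constants and logarithmic factors hidden in the $\tilde{O}_d$ and $\Omega_d$ notation are ignored, so the outputs are scaling terms rather than actual sample counts.

```python
from fractions import Fraction


def upper_exponent(d: int) -> Fraction:
    """Exponent of 1/eps in the abstract's upper bound: (d + 5) / 2."""
    if d < 1:
        raise ValueError("dimension d must be at least 1")
    return Fraction(d + 5, 2)


def lower_exponent(d: int) -> Fraction:
    """Exponent of 1/eps in the known lower bound: (d + 1) / 2."""
    if d < 1:
        raise ValueError("dimension d must be at least 1")
    return Fraction(d + 1, 2)


def sample_scaling(d: int, eps: float) -> tuple[float, float]:
    """Return the (lower, upper) polynomial terms (1/eps)^exponent.

    Assumption: constants depending on d and log factors are dropped,
    so these values only show how the bounds scale with eps and d.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    inv = 1.0 / eps
    return inv ** float(lower_exponent(d)), inv ** float(upper_exponent(d))


for d in (1, 2, 3, 4):
    lo, hi = sample_scaling(d, 0.1)
    gap = upper_exponent(d) - lower_exponent(d)
    print(f"d={d}: lower ~ {lo:.3g}, upper ~ {hi:.3g}, exponent gap = {gap}")
```

The exponent gap is $(d+5)/2 - (d+1)/2 = 2$ for every $d$, so the two polynomial terms differ by a factor of $(1/\epsilon)^2$ regardless of dimension.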
