ST · IT · LG · ML · Nov 10, 2020

A Statistical Perspective on Coreset Density Estimation

arXiv:2011.04907v2 · 10 citations
Originality: Incremental advance
AI Analysis

This work addresses a gap in coreset research by providing a statistical framework for density estimation. The advance is incremental: it builds on existing coreset methods and analyzes their theoretical properties.

The paper studies the statistical performance of coresets for nonparametric density estimation. It establishes the minimax rate for coreset-based estimators and shows that practical coreset kernel density estimators are near-minimax optimal over Hölder-smooth densities.

Coresets have emerged as a powerful tool to summarize data by selecting a small subset of the original observations while retaining most of its information. This approach has led to significant computational speedups, but the performance of statistical procedures run on coresets is largely unexplored. In this work, we develop a statistical framework to study coresets and focus on the canonical task of nonparametric density estimation. Our contributions are twofold. First, we establish the minimax rate of estimation achievable by coreset-based estimators. Second, we show that the practical coreset kernel density estimators are near-minimax optimal over a large class of Hölder-smooth densities.
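To make the idea of a coreset kernel density estimator concrete, here is a minimal sketch that compares a Gaussian KDE built on a full sample with one built on a small weighted subset. It is an illustration only: the subset is drawn by uniform subsampling, which is a stand-in and not the coreset construction analyzed in the paper. The function name `gaussian_kde`, the sizes `n` and `m`, and the bandwidth choice are all assumptions made for this example.

```python
import numpy as np


def gaussian_kde(points, weights, bandwidth, grid):
    """Weighted Gaussian kernel density estimate evaluated on a 1-D grid."""
    # Pairwise scaled differences between grid locations and data points.
    diffs = (grid[:, None] - points[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2) / (np.sqrt(2.0 * np.pi) * bandwidth)
    # Each grid value is a weighted sum of kernel bumps centered at the points.
    return kernel @ weights


rng = np.random.default_rng(0)
n, m = 5000, 200                    # full sample size, hypothetical subset size
data = rng.normal(size=n)           # synthetic data, for illustration only
grid = np.linspace(-4.0, 4.0, 201)
h = n ** (-1 / 5)                   # assumed bandwidth, not taken from the paper

# KDE on the full sample with equal weights 1/n.
full = gaussian_kde(data, np.full(n, 1.0 / n), h, grid)

# Stand-in "coreset": a uniform subsample with equal weights 1/m.
idx = rng.choice(n, size=m, replace=False)
core = gaussian_kde(data[idx], np.full(m, 1.0 / m), h, grid)

# Largest pointwise gap between the two estimates on the grid.
sup_gap = float(np.max(np.abs(full - core)))
print(f"sup-norm gap on grid: {sup_gap:.4f}")
```

The subset estimate costs `m` kernel evaluations per grid point instead of `n`, which is the computational gain the abstract refers to. How small `m` can be while still matching the accuracy of the full estimator is the statistical question the paper addresses.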

