MLITLGOct 26, 2020

Query Complexity of k-NN based Mode Estimation

arXiv:2010.13491v13 citations
Originality Incremental advance
AI Analysis

This addresses the mode estimation problem in machine learning and statistics, providing an efficient query-based approach for scenarios with noisy distance information, though it appears incremental as it builds on existing k-NN and oracle query frameworks.

The paper tackles the problem of identifying the point with the minimum k-th nearest neighbor distance in a dataset when pairwise distances are unknown but can be queried through a noisy oracle, proposing a sequential learning algorithm based on confidence intervals that solves the problem with high probability and shows significant improvement over baselines in numerical evaluations.

Motivated by the mode estimation problem of an unknown multivariate probability density function, we study the problem of identifying the point with the minimum k-th nearest neighbor distance for a given dataset of n points. We study the case where the pairwise distances are apriori unknown, but we have access to an oracle which we can query to get noisy information about the distance between any pair of points. For two natural oracle models, we design a sequential learning algorithm, based on the idea of confidence intervals, which adaptively decides which queries to send to the oracle and is able to correctly solve the problem with high probability. We derive instance-dependent upper bounds on the query complexity of our proposed scheme and also demonstrate significant improvement over the performance of other baselines via extensive numerical evaluations.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes