LG | IT | ML | Dec 4, 2016

Robust nonparametric nearest neighbor random process clustering

arXiv:1612.01103v3 | 4 citations
Originality: Incremental advance
AI Analysis

This paper addresses clustering problems in signal processing and data analysis, with applications such as human motion analysis. It is an incremental advance: it builds on existing methods and contributes new theoretical guarantees.

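The paper tackles clustering of noisy, finite-length observations of stationary ergodic random processes without prior knowledge of the model statistics or the number of clusters. The two algorithms it analyzes, nearest neighbor process clustering (NNPC) and a $k$-means variant (KM), both use the $L^1$-distance between estimated power spectral densities (PSDs) as the dissimilarity measure. The paper proves that both succeed with high probability under noise, missing entries, and significantly overlapping PSDs, provided the observation length is sufficiently large. It also reports that NNPC outperforms state-of-the-art algorithms in human motion sequence clustering.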

We consider the problem of clustering noisy finite-length observations of stationary ergodic random processes according to their generative models without prior knowledge of the model statistics and the number of generative models. Two algorithms, both using the $L^1$-distance between estimated power spectral densities (PSDs) as a measure of dissimilarity, are analyzed. The first one, termed nearest neighbor process clustering (NNPC), relies on partitioning the nearest neighbor graph of the observations via spectral clustering. The second algorithm, simply referred to as $k$-means (KM), consists of a single $k$-means iteration with farthest point initialization and was considered before in the literature, albeit with a different dissimilarity measure. We prove that both algorithms succeed with high probability in the presence of noise and missing entries, and even when the generative process PSDs overlap significantly, all provided that the observation length is sufficiently large. Our results quantify the tradeoff between the overlap of the generative process PSDs, the observation length, the fraction of missing entries, and the noise variance. Finally, we provide extensive numerical results for synthetic and real data and find that NNPC outperforms state-of-the-art algorithms in human motion sequence clustering.
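The KM variant described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. The MA(1) test processes, the Blackman-Tukey-style PSD estimator with a triangular lag window, all function names, the parameter values (`max_lag`, `n_freqs`, the observation length), the choice of the first center, and the reading of "a single $k$-means iteration" as one nearest-center assignment pass are assumptions made for this sketch. Only the Python standard library is used, to keep the example self-contained.

```python
import math
import random


def simulate_ma1(theta, n, rng):
    """Draw n samples of the MA(1) process x[t] = e[t] + theta * e[t-1] (toy model)."""
    e = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [e[t + 1] + theta * e[t] for t in range(n)]


def estimate_psd(x, max_lag=8, n_freqs=64):
    """Blackman-Tukey-style PSD estimate on a frequency grid over [0, 1/2]."""
    n = len(x)
    # Biased sample autocorrelation for lags 0..max_lag.
    r = [sum(x[t] * x[t + k] for t in range(n - k)) / n for k in range(max_lag + 1)]
    # Triangular (Bartlett) lag window; an assumption of this sketch.
    w = [1.0 - k / (max_lag + 1) for k in range(max_lag + 1)]
    psd = []
    for i in range(n_freqs):
        f = 0.5 * i / (n_freqs - 1)
        s = r[0] + 2.0 * sum(
            w[k] * r[k] * math.cos(2.0 * math.pi * f * k) for k in range(1, max_lag + 1)
        )
        psd.append(s)
    return psd


def l1_distance(p, q):
    """Discretized L1 distance between two PSD estimates on the same grid."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


def farthest_point_centers(dist, k):
    """Pick k centers: start at index 0 (arbitrary), then repeatedly take the
    point whose distance to its nearest chosen center is largest."""
    centers = [0]
    while len(centers) < k:
        nxt = max(range(len(dist)), key=lambda i: min(dist[i][c] for c in centers))
        centers.append(nxt)
    return centers


def assign_to_nearest(dist, centers):
    """One assignment pass: label each observation by its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: dist[i][centers[j]])
        for i in range(len(dist))
    ]


rng = random.Random(0)
# Three observations from each of two toy generative models.
observations = [simulate_ma1(+1.0, 512, rng) for _ in range(3)]
observations += [simulate_ma1(-1.0, 512, rng) for _ in range(3)]

psds = [estimate_psd(x) for x in observations]
dist = [[l1_distance(p, q) for q in psds] for p in psds]
centers = farthest_point_centers(dist, k=2)
labels = assign_to_nearest(dist, centers)
print(labels)
```

With two well-separated PSDs, as in this toy setup, the three observations of each process should share a label. NNPC would instead build a nearest neighbor graph from the same `dist` matrix and partition it via spectral clustering, which does not require `k` as an input.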

