LGJun 24, 2014

Incremental Clustering: The Case for Extra Clusters

arXiv:1406.6398v168 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of handling large-scale data efficiently for data analysts, but it is incremental as it builds on existing clustering theory.

The paper tackles the problem of incremental clustering's inability to detect certain cluster structures compared to batch methods, proving that some structures are impossible to identify incrementally and showing that allowing extra clusters can overcome these limitations.

The explosion in the amount of data available for analysis often necessitates a transition from batch to incremental clustering methods, which process one element at a time and typically store only a small subset of the data. In this paper, we initiate the formal analysis of incremental clustering methods focusing on the types of cluster structure that they are able to detect. We find that the incremental setting is strictly weaker than the batch model, proving that a fundamental class of cluster structures that can readily be detected in the batch setting is impossible to identify using any incremental method. Furthermore, we show how the limitations of incremental clustering can be overcome by allowing additional clusters.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes