LG · Dec 1, 2022

Clustering -- Basic concepts and methods

arXiv:2212.01248v1 · 2 citations · h-index: 22
Originality: Synthesis-oriented
AI Analysis

This is an introductory review paper aimed at beginners in data analysis. It offers an overview of the field rather than novel contributions.

The paper reviews clustering as an analysis tool, covering basic concepts, methods, and validation techniques, without presenting new experimental results or numerical findings.

We review clustering as an analysis tool and the underlying concepts from an introductory perspective. What is clustering and how can clusterings be realised programmatically? How can data be represented and prepared for a clustering task? And how can clustering results be validated? Connectivity-based versus prototype-based approaches are reflected in the context of several popular methods: single-linkage, spectral embedding, k-means, and Gaussian mixtures are discussed as well as the density-based protocols (H)DBSCAN, Jarvis-Patrick, CommonNN, and density-peaks.
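To make the contrast between connectivity-based and prototype-based approaches concrete, here is a minimal sketch in plain Python. It is not taken from the paper. The function names `single_linkage` and `k_means`, the toy data, the distance threshold, and the choice of initial centres are all assumptions made for illustration. Cutting a single-linkage hierarchy at a distance threshold is implemented here as finding connected components of the graph that links all points closer than that threshold.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, threshold):
    """Connectivity-based sketch: points closer than `threshold` are linked,
    and every connected component becomes one cluster."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, ... in order of first appearance.
    label_of_root = {}
    return [label_of_root.setdefault(find(i), len(label_of_root))
            for i in range(len(points))]


def k_means(points, centres, n_iter=10):
    """Prototype-based sketch: assign each point to its nearest centre,
    then move each centre to the mean of its members, and repeat."""
    centres = [tuple(c) for c in centres]
    for _ in range(n_iter):
        labels = [min(range(len(centres)), key=lambda k: dist(p, centres[k]))
                  for p in points]
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centre if a cluster is empty
                centres[k] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centres


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

sl_labels = single_linkage(points, threshold=2.0)
km_labels, km_centres = k_means(points, centres=[points[0], points[3]])
print(sl_labels)
print(km_labels)
```

On well-separated, compact groups like these, both approaches recover the same partition. They start to disagree on elongated or nested clusters, where linking neighbours follows the shape of the data while nearest-centre assignment does not.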

