DS, CV, LG, CO, ML; Nov 25, 2015

A Short Survey on Data Clustering Algorithms

arXiv:1511.09123v1, 57 citations
Originality: Synthesis-oriented
AI Analysis

This is a survey paper that summarizes existing work for researchers in data analytics.

This paper reviews state-of-the-art clustering algorithms, discussing their design concepts, methodologies, and evaluation metrics, and provides future insights.

With rapidly increasing data volumes, clustering algorithms have become important tools for data analytics in modern research. They have been successfully applied to a wide range of domains, for instance bioinformatics, speech recognition, and financial analysis. Formally speaking, given a set of data instances, a clustering algorithm is expected to divide the set into subsets that maximize intra-subset similarity and inter-subset dissimilarity, where the similarity measure is defined beforehand. In this work, state-of-the-art clustering algorithms are reviewed from design concept to methodology, and different clustering paradigms and advanced clustering algorithms are discussed. After that, existing clustering evaluation metrics are reviewed. A summary with future insights is provided at the end.
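The formal objective in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes Euclidean distance as the dissimilarity measure (so high similarity means small distance), and the function name `partition_quality` is hypothetical. Given a candidate partition, it reports the mean distance between pairs in the same subset and between pairs in different subsets.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def _mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def partition_quality(points, labels):
    """Return (mean intra-subset distance, mean inter-subset distance).

    Assumption: Euclidean distance stands in for dissimilarity, so a
    good partition has a small first value and a large second value.
    """
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        if lp == lq:
            intra.append(dist(p, q))   # pair inside one subset
        else:
            inter.append(dist(p, q))   # pair across two subsets
    return _mean(intra), _mean(inter)


# Two visually obvious groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good = partition_quality(points, [0, 0, 1, 1])  # groups nearby points
bad = partition_quality(points, [0, 1, 0, 1])   # mixes the two groups
```

Under this assumed measure, the partition that groups nearby points has a lower intra-subset distance and a higher inter-subset distance than the partition that mixes them, which is the sense in which it is the better clustering.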

