DS CV LG CO ML

Nov 25, 2015

A Short Survey on Data Clustering Algorithms

arXiv:1511.09123v17.357 citations

Originality: Synthesis-oriented

AI Analysis

This is a survey paper that summarizes existing work for researchers in data analytics.

This paper reviews state-of-the-art clustering algorithms, discussing their design concepts, methodologies, and evaluation metrics, and provides future insights.

With data volumes growing rapidly, clustering algorithms have become important tools for data analytics in modern research. They have been successfully applied in a wide range of domains, for instance bioinformatics, speech recognition, and financial analysis. Formally, given a set of data instances, a clustering algorithm is expected to divide the set into subsets that maximize intra-subset similarity and inter-subset dissimilarity, where the similarity measure is defined beforehand. In this work, state-of-the-art clustering algorithms are reviewed from design concept to methodology. Different clustering paradigms are discussed, along with advanced clustering algorithms. The existing clustering evaluation metrics are then reviewed, and a summary with future insights is provided at the end.
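The formal statement in the abstract can be made concrete with a small sketch. It is not taken from the paper: it assumes Euclidean distance as the predefined (dis)similarity measure, and the helper name `partition_scores` and the toy points are hypothetical. It scores a given partition by its mean intra-cluster and mean inter-cluster distance, so a good clustering should show a small first value and a large second value.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def _mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def partition_scores(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance).

    Assumption: Euclidean distance stands in for the dissimilarity
    measure that the abstract says is "defined beforehand".
    """
    intra, inter = [], []
    # Visit every unordered pair of points once.
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        if label_p == label_q:
            intra.append(dist(p, q))  # same subset
        else:
            inter.append(dist(p, q))  # different subsets
    return _mean(intra), _mean(inter)


# Hypothetical toy data: two visually obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good_intra, good_inter = partition_scores(points, [0, 0, 1, 1])
bad_intra, bad_inter = partition_scores(points, [0, 1, 0, 1])
```

On this toy data the partition that groups nearby points together has the lower intra-cluster distance and the higher inter-cluster distance, which is the objective the formal definition describes. Actual clustering algorithms search for such a partition rather than being handed one.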
