Practical Introduction to Clustering Data
This is an incremental tutorial for beginners in data analysis, offering practical guidance on clustering techniques.
The paper introduces clustering methods and presents three basic algorithms (k-means, neighbour-based clustering, and agglomerative clustering), each with C source code examples to facilitate implementation.
Data clustering is an approach to finding structure in sets of complex data, i.e., sets of "objects". The main objective is to identify groups of objects that are similar to each other, e.g., for classification. Here, an introduction to clustering is given and three basic approaches are introduced: the k-means algorithm, neighbour-based clustering, and an agglomerative clustering method. For each approach, C source code examples are given, allowing for easy implementation.
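As a rough illustration of the first of these approaches, the following is a minimal sketch of k-means for one-dimensional data. It is not the paper's own code: the function name kmeans_1d, its parameters, and the restriction to scalar data points are assumptions made for brevity. The sketch alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points, and stops when no assignment changes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the paper): k-means on 1-D data.
 * x        : n data points
 * centers  : k initial cluster centers, updated in place
 * label    : output array of length n, cluster index of each point
 * max_iter : upper bound on the number of iterations
 * Returns the number of completed update steps. */
int kmeans_1d(const double *x, size_t n, double *centers, size_t k,
              int *label, int max_iter)
{
    assert(x && centers && label && k > 0 && n >= k);

    for (size_t i = 0; i < n; i++)
        label[i] = -1;                 /* no point is assigned yet */

    int iter;
    for (iter = 0; iter < max_iter; iter++) {
        int changed = 0;

        /* Assignment step: attach each point to its nearest center. */
        for (size_t i = 0; i < n; i++) {
            int best = 0;
            double diff = x[i] - centers[0];
            double best_d = diff * diff;   /* squared distance */
            for (size_t c = 1; c < k; c++) {
                diff = x[i] - centers[c];
                if (diff * diff < best_d) {
                    best_d = diff * diff;
                    best = (int)c;
                }
            }
            if (label[i] != best) {
                label[i] = best;
                changed = 1;
            }
        }
        if (!changed)
            break;                     /* assignments are stable */

        /* Update step: move each center to the mean of its points. */
        for (size_t c = 0; c < k; c++) {
            double sum = 0.0;
            size_t count = 0;
            for (size_t i = 0; i < n; i++) {
                if (label[i] == (int)c) {
                    sum += x[i];
                    count++;
                }
            }
            if (count > 0)             /* leave empty clusters in place */
                centers[c] = sum / (double)count;
        }
    }
    return iter;
}
```

A caller would supply the data, an initial guess for the centers, and a label array of the same length as the data. The result generally depends on the initial centers, so in practice several different initializations are often tried.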