LG DB · Feb 27, 2014

Outlier Detection using Improved Genetic K-means

arXiv:1402.6859v1 · 32 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses outlier detection for data analysis, but it appears incremental as it builds upon existing genetic k-means methods.

The paper tackles the problem of outlier detection and data clustering simultaneously by improving centroid estimation during clustering and outlier discovery, resulting in a two-stage algorithm that first applies an improved genetic k-means process and then iteratively removes distant vectors.

The outlier detection problem is in some cases similar to the classification problem. For example, the main concern of clustering-based outlier detection algorithms is to find clusters and outliers, where the outliers are often regarded as noise that should be removed to make the clustering more reliable. In this article, we present an algorithm that provides outlier detection and data clustering simultaneously. The algorithm improves the estimation of the centroids of the generative distribution during the process of clustering and outlier discovery. The proposed algorithm consists of two stages. The first stage is an improved genetic k-means algorithm (IGK) process, while the second stage iteratively removes the vectors that are far from their cluster centroids.
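The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: plain k-means stands in for the IGK stage (the genetic component is not reproduced here), a vector counts as "far" when its distance to its centroid exceeds a hypothetical `ratio` times the mean distance, and the names `detect_outliers`, `ratio`, `max_rounds`, and `init` are illustrative only.

```python
import math
import random


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def update(points, labels, centroids):
    """Recompute each centroid as the mean of its members (keep it if empty)."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centroids.append(
                tuple(sum(col) / len(members) for col in zip(*members))
            )
        else:
            new_centroids.append(old)
    return new_centroids


def kmeans(points, k, init=None, iters=50, seed=0):
    """Plain k-means, used here only as a stand-in for the IGK stage."""
    rng = random.Random(seed)
    centroids = [tuple(c) for c in init] if init else rng.sample(points, k)
    for _ in range(iters):
        new_centroids = update(points, assign(points, centroids), centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def detect_outliers(points, k, ratio=3.0, max_rounds=10, init=None, seed=0):
    """Stage 1: cluster. Stage 2: iteratively drop vectors far from centroids."""
    centroids = kmeans(points, k, init=init, seed=seed)
    inliers, outliers = list(points), []
    for _ in range(max_rounds):
        if not inliers:
            break
        labels = assign(inliers, centroids)
        dists = [math.dist(p, centroids[lab]) for p, lab in zip(inliers, labels)]
        # Assumed rule: "far" means more than `ratio` times the mean distance.
        cutoff = ratio * sum(dists) / len(dists)
        far = [d > cutoff for d in dists]
        if not any(far):
            break
        outliers += [p for p, f in zip(inliers, far) if f]
        inliers = [p for p, f in zip(inliers, far) if not f]
        # Re-estimate centroids without the removed vectors.
        centroids = update(inliers, assign(inliers, centroids), centroids)
    return centroids, inliers, outliers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (10, 30)]
    print(detect_outliers(data, k=2, init=[(0, 0), (10, 10)]))
```

In this toy run the point (10, 30) initially pulls the second centroid away from its cluster; once it is removed, the centroid is re-estimated from the remaining members, which is the kind of improvement in centroid estimation the abstract refers to. The threshold rule and stopping criteria in the paper itself may differ.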
