LG · AI · Oct 6, 2021

Clustering Plotted Data by Image Segmentation

arXiv:2110.05187v1
Originality: Incremental advance
AI Analysis

The paper offers a new approach to clustering that is fast and produces intuitive results, though the advance is incremental because it adapts existing segmentation techniques to a new application.

The paper tackles clustering of 2D data by training neural networks to perform instance segmentation on plotted data. The resulting method is faster than most existing clustering algorithms, hyperparameter-free by default, and consistent with human intuition about clusters, as shown in comparisons with ten other methods on synthetic data.

Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar points. In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data: by training neural networks to perform instance segmentation on plotted data. Our approach, Visual Clustering, has several advantages over traditional clustering algorithms: it is much faster than most existing clustering algorithms (making it suitable for very large datasets), it agrees strongly with human intuition for clusters, and it is by default hyperparameter free (although additional steps with hyperparameters can be introduced for more control of the algorithm). We describe the method and compare it to ten other clustering methods on synthetic data to illustrate its advantages and disadvantages. We then demonstrate how our approach can be extended to higher dimensional data and illustrate its performance on real-world data. The implementation of Visual Clustering is publicly available and can be applied to any dataset in a few lines of code.
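The core idea in the abstract (plot the points, segment the resulting image, then read each point's cluster from the segment it falls in) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper trains neural networks for instance segmentation, whereas the sketch below swaps in plain 8-connected component labeling as a stand-in, so it only separates clusters that do not touch in the rendered image. The function names (`rasterize`, `label_components`, `cluster_by_plot`) and the `size` resolution parameter are hypothetical and chosen for illustration only.

```python
from collections import deque

import numpy as np


def rasterize(points, size=64):
    """Plot 2D points onto a size x size binary image; return image and pixel coords."""
    pts = np.asarray(points, dtype=float)
    lo = pts.min(axis=0)
    span = pts.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on a degenerate axis
    pix = np.floor((pts - lo) / span * (size - 1)).astype(int)
    image = np.zeros((size, size), dtype=bool)
    image[pix[:, 1], pix[:, 0]] = True  # row = y, column = x
    return image, pix


def label_components(image):
    """Stand-in for the segmentation network: label 8-connected foreground regions."""
    labels = np.zeros(image.shape, dtype=int)
    height, width = image.shape
    current = 0
    for r, c in zip(*np.nonzero(image)):
        if labels[r, c]:
            continue  # pixel already belongs to a labeled region
        current += 1
        labels[r, c] = current
        queue = deque([(r, c)])
        while queue:  # breadth-first flood fill over neighbouring pixels
            y, x = queue.popleft()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny, nx] and not labels[ny, nx]):
                        labels[ny, nx] = current
                        queue.append((ny, nx))
    return labels


def cluster_by_plot(points, size=64):
    """Assign each point the label of the image segment its pixel belongs to."""
    image, pix = rasterize(points, size)
    segments = label_components(image)
    return segments[pix[:, 1], pix[:, 0]]


# Two well-separated square blobs on a regular grid (deterministic toy data).
axis = np.linspace(0.0, 1.0, 11)
blob = np.array([(x, y) for x in axis for y in axis])
points = np.vstack([blob, blob + 9.0])
labels = cluster_by_plot(points)
```

Because the work happens on a fixed-size image rather than on pairwise distances, the segmentation step does not grow with the number of points, which is consistent with the speed advantage the abstract claims. In this sketch the image resolution is an explicit choice, unlike the default hyperparameter-free behaviour described for the actual method.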

Code Implementations: 1 repo
