LG, Nov 4, 2021

ExClus: Explainable Clustering on Low-dimensional Data Representations

arXiv:2111.03168v1, 1 citation
Originality: Incremental advance
AI Analysis

This work addresses a challenge that data analysts face when they need to understand clustering results shown in dimensionality-reduced visualizations. It is an incremental advance because it builds on existing clustering and explanation techniques.

The paper tackles the problem of interpreting cluster structure in low-dimensional data projections. It proposes ExClus, a method that automatically computes interpretable clusterings whose explanations are expressed in the original high-dimensional space, using a tunable greedy algorithm grounded in information theory. Experiments show that it yields informative patterns, and they reveal where the algorithm is efficient and where its tunability and scalability could improve.

Dimensionality reduction and clustering techniques are frequently used to analyze complex data sets, but their results are often not easy to interpret. We consider how to support users in interpreting apparent cluster structure on scatter plots where the axes are not directly interpretable, such as when the data is projected onto a two-dimensional space using a dimensionality-reduction method. Specifically, we propose a new method to compute an interpretable clustering automatically, where the explanation is in the original high-dimensional space and the clustering is coherent in the low-dimensional projection. It provides a tunable balance between the complexity and the amount of information provided, through the use of information theory. We study the computational complexity of this problem and introduce restrictions on the search space of solutions to arrive at an efficient, tunable, greedy optimization algorithm. This algorithm is furthermore implemented in an interactive tool called ExClus. Experiments on several data sets highlight that ExClus can provide informative and easy-to-understand patterns, and they expose where the algorithm is efficient and where there is room for improvement considering tunability and scalability.
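
To make the idea of "explaining clusters in the original high-dimensional space" concrete, here is a minimal sketch. It is not the ExClus algorithm: the abstract does not give the objective or the search procedure, so everything below is an assumption made for illustration. It assumes cluster labels are already available (for example, from clustering the 2D projection), models each original feature as a univariate Gaussian, and ranks features per cluster by the KL divergence between the cluster's distribution and the global one. The names `gaussian_kl`, `explain_clusters`, and the `top_k` parameter are hypothetical; `top_k` is only a crude stand-in for a tunable trade-off between explanation complexity and information.

```python
import numpy as np


def gaussian_kl(mu_c, var_c, mu_g, var_g):
    """KL( N(mu_c, var_c) || N(mu_g, var_g) ) for univariate Gaussians, in nats."""
    return 0.5 * (np.log(var_g / var_c) + (var_c + (mu_c - mu_g) ** 2) / var_g - 1.0)


def explain_clusters(X, labels, top_k=2, eps=1e-9):
    """Rank original features per cluster by how much they deviate from the global data.

    X      : (n_samples, n_features) array in the ORIGINAL high-dimensional space.
    labels : (n_samples,) cluster assignment, assumed to come from the low-dimensional view.
    top_k  : hypothetical knob; larger values give longer (more complex) explanations.
    Returns {cluster: [(feature_index, kl_in_nats), ...]} sorted by decreasing KL.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mu_g = X.mean(axis=0)
    var_g = X.var(axis=0) + eps  # eps guards against zero variance
    explanations = {}
    for c in np.unique(labels):
        Xc = X[labels == c]
        kl = gaussian_kl(Xc.mean(axis=0), Xc.var(axis=0) + eps, mu_g, var_g)
        order = np.argsort(kl)[::-1][:top_k]  # most informative features first
        explanations[int(c)] = [(int(j), float(kl[j])) for j in order]
    return explanations


# Usage on synthetic data: feature 0 separates the two clusters, features 1-2 are noise.
rng = np.random.default_rng(0)
demo_labels = np.repeat([0, 1], 100)
demo_X = rng.normal(size=(200, 3))
demo_X[:, 0] += 5.0 * demo_labels
demo_explanations = explain_clusters(demo_X, demo_labels, top_k=1)
```

In this toy setting each cluster's explanation should single out feature 0, since it is the only feature whose within-cluster distribution differs markedly from the global one. A real system would also have to choose the clustering itself and decide how many features to report, which is the part the paper addresses with its greedy, information-theoretic optimization.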

