AI · LG · Dec 16, 2021

KnAC: an approach for enhancing cluster analysis with background knowledge and explanations

arXiv:2112.08759v2 · 13 citations
Originality: Incremental advance
AI Analysis

This work addresses the bottleneck of post-clustering analysis for domain experts. It is an incremental improvement: it augments existing clustering methods rather than introducing a new paradigm.

The authors tackle the problem of expert interpretation and conformance checking in clustering by introducing Knowledge Augmented Clustering (KnAC), which integrates background knowledge to refine expert labels. They report better results than classic clustering methods in both artificial and real-life scenarios.

Pattern discovery in multidimensional data sets has been the subject of research for decades. There exists a wide spectrum of clustering algorithms that can be used for this purpose. However, their practical applications share a common post-clustering phase, which concerns expert-based interpretation and analysis of the obtained results. We argue that this can be the bottleneck in the process, especially in cases where domain knowledge exists prior to clustering. Such a situation requires not only a proper analysis of automatically discovered clusters but also conformance checking with existing knowledge. In this work, we present Knowledge Augmented Clustering (KnAC). Its main goal is to confront expert-based labelling with automated clustering for the sake of updating and refining the former. Our solution is not restricted to any existing clustering algorithm. Instead, KnAC can serve as an augmentation of an arbitrary clustering algorithm, making the approach a robust, model-agnostic improvement of any state-of-the-art clustering method. We demonstrate the feasibility of our method on artificial, reproducible examples and in a real-life use case scenario. In both cases, we achieved better results than classic clustering algorithms without augmentation.
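The idea of confronting expert labelling with automated clustering can be illustrated with a small sketch. This is not the authors' implementation: it only assumes that one compares the two labellings through a contingency table and flags expert labels whose points are spread over several clusters. The function names `contingency` and `split_candidates` and the `min_share` threshold are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict


def contingency(expert_labels, cluster_labels):
    """Count how many points of each expert label fall into each cluster."""
    table = defaultdict(Counter)
    for expert, cluster in zip(expert_labels, cluster_labels):
        table[expert][cluster] += 1
    return table


def split_candidates(expert_labels, cluster_labels, min_share=0.3):
    """Flag expert labels whose points spread over several clusters.

    min_share is a hypothetical threshold: a cluster counts as a
    significant part of a label only if it holds at least this
    fraction of the label's points.
    """
    candidates = {}
    for expert, counts in contingency(expert_labels, cluster_labels).items():
        total = sum(counts.values())
        significant = sorted(c for c, n in counts.items() if n / total >= min_share)
        if len(significant) > 1:
            candidates[expert] = significant
    return candidates


# Toy data: expert label "A" is spread over clusters 0 and 1, "B" sits in cluster 2.
expert = ["A", "A", "A", "A", "B", "B", "B"]
clusters = [0, 0, 1, 1, 2, 2, 2]
suggested_splits = split_candidates(expert, clusters)
print(suggested_splits)  # {'A': [0, 1]}
```

Because the sketch only consumes cluster assignments, the labels could come from any clustering algorithm, which mirrors the model-agnostic framing in the abstract. A symmetric check (several expert labels landing in one cluster) would suggest merge candidates in the same way.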

Code Implementations: 1 repo