AI · Dec 19, 2023

Outlier detection using flexible categorisation and interrogative agendas

arXiv:2312.12010v2 · 6 citations · h-index: 28 · DSS
AI Analysis

This work addresses outlier detection in machine learning by incorporating epistemic stances into categorization, offering an incremental improvement with added interpretability.

The paper tackles outlier detection with a method that categorizes objects flexibly according to different sets of features (interrogative agendas) and combines unsupervised and supervised algorithms. The results show that these algorithms perform comparably to standard outlier detection methods on common datasets while also providing explanations for their outcomes.

Categorization is one of the basic tasks in machine learning and data analysis. Building on formal concept analysis (FCA), the starting point of the present work is that a given set of objects can be categorized in different ways, depending on the set of features used to classify them, and that different sets of features may yield better or worse categorizations relative to the task at hand. In turn, the (a priori) choice of one set of features over another might be subjective and express a certain epistemic stance (e.g. interests, relevance, preferences) of an agent or a group of agents, namely, their interrogative agenda. In the present paper, we represent interrogative agendas as sets of features, and explore and compare different ways to categorize objects w.r.t. different sets of features (agendas). We first develop a simple unsupervised FCA-based algorithm for outlier detection which uses categorizations arising from different agendas. We then present a supervised meta-learning algorithm to learn suitable (fuzzy) agendas for categorization as sets of features with different weights or masses. We combine this meta-learning algorithm with the unsupervised outlier detection algorithm to obtain a supervised outlier detection algorithm. We show that these algorithms perform on par with commonly used outlier detection algorithms on commonly used outlier detection datasets. These algorithms provide both local and global explanations of their results.
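
To make the idea of agenda-dependent categorization concrete, here is a minimal Python sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes a binary formal context (each object mapped to the set of features it has), treats an agenda as a set of features, and scores an object by how small its FCA closure is under each agenda: the fewer objects that share all of its agenda-relevant features, the more outlying it looks. The function names `extent_size` and `outlier_scores`, the scoring rule, and the optional agenda `weights` (a loose stand-in for the "masses" of a fuzzy agenda) are all assumptions made for illustration.

```python
def extent_size(context, obj, agenda):
    """Size of the smallest category containing `obj` under `agenda`.

    The category is the FCA closure of `obj`: all objects that have
    every agenda feature that `obj` has.
    """
    intent = context[obj] & agenda  # features of obj that the agenda asks about
    return sum(1 for feats in context.values() if intent <= feats)


def outlier_scores(context, agendas, weights=None):
    """Weighted average over agendas of 1 - |category| / |objects|.

    Hypothetical scoring rule: an object that sits in small categories
    under many (heavily weighted) agendas gets a score close to 1.
    """
    if weights is None:
        weights = [1.0] * len(agendas)
    total = sum(weights)
    n = len(context)
    scores = {}
    for obj in context:
        s = 0.0
        for agenda, w in zip(agendas, weights):
            s += w * (1.0 - extent_size(context, obj, agenda) / n)
        scores[obj] = s / total
    return scores


# Toy formal context: object -> set of features it has (illustrative data).
context = {
    "a": {"small", "round"},
    "b": {"small", "round"},
    "c": {"small", "round"},
    "d": {"large", "square"},
}

# Three agendas: size only, shape only, and both together.
agendas = [
    {"small", "large"},
    {"round", "square"},
    {"small", "large", "round", "square"},
]

scores = outlier_scores(context, agendas)
most_outlying = max(scores, key=scores.get)
```

In this toy context, object `d` shares its features with no other object under any of the three agendas, so it receives the highest score. Because each agenda contributes a separate term, the per-agenda values also indicate which features made an object stand out, which is one simple way to read the local explanations the abstract mentions.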

