LG · Nov 26, 2025

SetAD: Semi-Supervised Anomaly Learning in Contextual Sets

Jianling Gao, Chongyang Tao, Xuelian Lin, Junfeng Liu, Shuai Ma

arXiv:2512.07863v1 · 1 citation

Originality: Highly original

AI Analysis

This paper addresses the challenge of detecting anomalies within contextual groups, for applications such as fraud detection or system monitoring. It presents a novel method rather than an incremental improvement.

The paper tackles semi-supervised anomaly detection by reframing it as a set-level task that captures contextual group interactions. The resulting method, SetAD, significantly outperformed state-of-the-art models on 10 real-world datasets, and its performance improved as set size increased.

Semi-supervised anomaly detection (AD) has shown great promise by effectively leveraging limited labeled data. However, existing methods are typically structured around scoring individual points or simple pairs. Such a point- or pair-centric view not only overlooks the contextual nature of anomalies, which are defined by their deviation from a collective group, but also fails to exploit the rich supervisory signals that can be generated from the combinatorial composition of sets. Consequently, such models struggle to exploit the high-order interactions within the data, which are critical for learning discriminative representations. To address these limitations, we propose SetAD, a novel framework that reframes semi-supervised AD as a Set-level Anomaly Detection task. SetAD employs an attention-based set encoder trained via a graded learning objective, where the model learns to quantify the degree of anomalousness within an entire set. This approach directly models the complex group-level interactions that define anomalies. Furthermore, to enhance robustness and score calibration, we propose a context-calibrated anomaly scoring mechanism, which assesses a point's anomaly score by aggregating its normalized deviations from peer behavior across multiple, diverse contextual sets. Extensive experiments on 10 real-world datasets demonstrate that SetAD significantly outperforms state-of-the-art models. Notably, we show that our model's performance consistently improves with increasing set size, providing strong empirical support for the set-based formulation of anomaly detection.
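The abstract does not give the exact scoring formula, so the following is only a minimal sketch of what "aggregating normalized deviations from peer behavior across multiple contextual sets" could look like. It assumes each point already has a raw per-point score (here a plain array standing in for a model's output), treats "normalized deviation" as a z-score against randomly sampled peers, and aggregates by averaging. The function name, parameters, and sampling scheme are all hypothetical and are not taken from the paper.

```python
import numpy as np


def context_calibrated_scores(raw_scores, set_size=8, n_contexts=50, seed=0):
    """Average each point's z-score against peers in random contextual sets.

    raw_scores: 1-D array of per-point scores (a hypothetical stand-in for a
    model's output); larger means more anomalous.
    """
    raw = np.asarray(raw_scores, dtype=float)
    n = raw.shape[0]
    if set_size - 1 > n - 1:
        raise ValueError("set_size cannot exceed the number of points")

    rng = np.random.default_rng(seed)
    calibrated = np.zeros(n)
    for i in range(n):
        others = np.delete(np.arange(n), i)  # candidate peers, excluding i
        deviations = np.empty(n_contexts)
        for k in range(n_contexts):
            # One contextual set = point i plus (set_size - 1) sampled peers.
            peers = rng.choice(others, size=set_size - 1, replace=False)
            mu = raw[peers].mean()
            sigma = raw[peers].std() + 1e-8  # guard against zero spread
            deviations[k] = (raw[i] - mu) / sigma
        # Aggregate across contexts (assumption: a simple mean).
        calibrated[i] = deviations.mean()
    return calibrated


# Nine similar scores and one clear outlier at index 9.
raw = np.array([0.10, 0.12, 0.09, 0.11, 0.14, 0.13, 0.08, 0.07, 0.15, 0.95])
scores = context_calibrated_scores(raw, set_size=4, n_contexts=20)
top = int(np.argmax(scores))
```

Averaging over many sampled sets means a single unusual peer group has limited influence on a point's final score, which is one plausible reading of the robustness claim. The paper's actual mechanism may differ in how sets are built and how deviations are normalized.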
