LGNov 10, 2025

Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time

arXiv:2511.07023v17 citationsh-index: 12
Originality Incremental advance
AI Analysis

This addresses the practical issue of distribution shifts in graph anomaly detection for real-world applications, offering a lightweight solution without retraining.

The paper tackles the problem of graph anomaly detection (GAD) models degrading in performance due to unseen normal samples during deployment, by proposing a test-time adaptation framework called TUNE that aligns shifted data and minimizes representation-level shifts, resulting in significant enhancements in generalizability across 10 real-world datasets.

Graph anomaly detection (GAD), which aims to detect outliers in graph-structured data, has received increasing research attention recently. However, existing GAD methods assume identical training and testing distributions, which is rarely valid in practice. In real-world scenarios, unseen but normal samples may emerge during deployment, leading to a normality shift that degrades the performance of GAD models trained on the original data. Through empirical analysis, we reveal that the degradation arises from (1) semantic confusion, where unseen normal samples are misinterpreted as anomalies due to their novel patterns, and (2) aggregation contamination, where the representations of seen normal nodes are distorted by unseen normals through message aggregation. While retraining or fine-tuning GAD models could be a potential solution to the above challenges, the high cost of model retraining and the difficulty of obtaining labeled data often render this approach impractical in real-world applications. To bridge the gap, we proposed a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns (TUNE) in GAD. To address semantic confusion, a graph aligner is employed to align the shifted data to the original one at the graph attribute level. Moreover, we utilize the minimization of representation-level shift as a supervision signal to train the aligner, which leverages the estimated aggregation contamination as a key indicator of normality shift. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes