ML, LG · Aug 9, 2025

Statistical Inference for Autoencoder-based Anomaly Detection after Representation Learning-based Domain Adaptation

Tran Tuan Kiet, Nguyen Thang Loi, Vo Nguyen Le Duy

arXiv:2508.07049v14.51 citations · h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses the need for statistically rigorous anomaly detection in target domains with limited data. It is an incremental advance because it builds on existing selective inference methods.

The paper tackles the problem of uncertainty in anomaly detection after domain adaptation by proposing STAND-DA, a framework that computes valid p-values for detected anomalies and controls the false positive rate below a pre-specified level (e.g., 0.05), validated through experiments on synthetic and real-world datasets.

Anomaly detection (AD) plays a vital role across a wide range of domains, but its performance might deteriorate when applied to target domains with limited data. Domain Adaptation (DA) offers a solution by transferring knowledge from a related source domain with abundant data. However, this adaptation process can introduce additional uncertainty, making it difficult to draw statistically valid conclusions from AD results. In this paper, we propose STAND-DA, a novel framework for statistically rigorous Autoencoder-based AD after Representation Learning-based DA. Built on the Selective Inference (SI) framework, STAND-DA computes valid $p$-values for detected anomalies and rigorously controls the false positive rate below a pre-specified level $\alpha$ (e.g., 0.05). To address the computational challenges of applying SI to deep learning models, we develop a GPU-accelerated SI implementation, significantly enhancing both scalability and runtime performance. This advancement makes SI practically feasible for modern, large-scale deep architectures. Extensive experiments on synthetic and real-world datasets validate the theoretical results and computational efficiency of the proposed STAND-DA method.
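To make the idea of a valid $p$-value after selection concrete, here is a minimal toy sketch. It is not the STAND-DA procedure. It assumes a hypothetical one-dimensional detector that flags a point as anomalous when a standard-normal score exceeds a fixed threshold, and it compares a naive $p$-value with one conditioned on that selection event (a truncated normal). The function names, the threshold of 1.0, and the simulation settings are all illustrative assumptions.

```python
import math
import random


def normal_sf(x: float) -> float:
    """Survival function P(Z > x) of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_p_value(x: float) -> float:
    """One-sided p-value that ignores how the point was selected."""
    return normal_sf(x)


def selective_p_value(x: float, threshold: float) -> float:
    """One-sided p-value conditional on the selection event {X > threshold}.

    Under the null, X given X > threshold follows a truncated normal, so
    the tail probability is rescaled by P(X > threshold).
    """
    return normal_sf(x) / normal_sf(threshold)


def simulate_fpr(threshold=1.0, alpha=0.05, n=200_000, seed=0):
    """Estimate false positive rates among flagged points when every point is null."""
    rng = random.Random(seed)
    # Hypothetical detector: flag a point when its score exceeds the threshold.
    flagged = [x for x in (rng.gauss(0.0, 1.0) for _ in range(n)) if x > threshold]
    naive = sum(naive_p_value(x) <= alpha for x in flagged) / len(flagged)
    selective = sum(selective_p_value(x, threshold) <= alpha for x in flagged) / len(flagged)
    return naive, selective


naive_fpr, selective_fpr = simulate_fpr()
print(f"naive FPR among flagged points:     {naive_fpr:.3f}")
print(f"selective FPR among flagged points: {selective_fpr:.3f}")
```

In this toy setting the naive test rejects far more often than $\alpha$ among flagged points, while the conditional $p$-value keeps the rejection rate close to $\alpha$. In the paper's setting the selection event comes from the domain adaptation and autoencoder pipeline rather than a single threshold, which this sketch does not attempt to model.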
