CR AI CY LG Sep 5, 2025

Adversarial Augmentation and Active Sampling for Robust Cyber Anomaly Detection

arXiv:2509.04999v1 1 citations h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses the problem of scarce labeled data for cybersecurity anomaly detection, offering an incremental improvement over existing methods.

The paper tackles the detection of Advanced Persistent Threats (APTs) by combining AutoEncoders with active learning, which reduces labeling costs and improves accuracy. It reports substantial improvements in detection rates on real-world, imbalanced data in which APT-like attacks account for just 0.004% of the data.

Advanced Persistent Threats (APTs) present a considerable challenge to cybersecurity due to their stealthy, long-duration nature. Traditional supervised learning methods typically require large amounts of labeled data, which is often scarce in real-world scenarios. This paper introduces a novel approach that combines AutoEncoders for anomaly detection with active learning to iteratively enhance APT detection. By selectively querying an oracle for labels on uncertain or ambiguous samples, our method reduces labeling costs while improving detection accuracy, enabling the model to effectively learn with minimal data and reduce reliance on extensive manual labeling. We present a comprehensive formulation of the Attention Adversarial Dual AutoEncoder-based anomaly detection framework and demonstrate how the active learning loop progressively enhances the model's performance. The framework is evaluated on real-world, imbalanced provenance trace data from the DARPA Transparent Computing program, where APT-like attacks account for just 0.004% of the data. The datasets, which cover multiple operating systems including Android, Linux, BSD, and Windows, are tested in two attack scenarios. The results show substantial improvements in detection rates during active learning, outperforming existing methods.
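
The loop described in the abstract (score samples by reconstruction error, ask an oracle about the ambiguous ones, retrain) can be illustrated with a minimal sketch. This is not the paper's Attention Adversarial Dual AutoEncoder: a rank-k linear autoencoder (PCA via SVD) is swapped in so the example stays small and runnable. The function names, the 0.99 quantile threshold, the batch size, and the "closest to the threshold" notion of uncertainty are all assumptions made for illustration only.

```python
import numpy as np


def fit_linear_autoencoder(X, k):
    """Fit a rank-k linear autoencoder (PCA). A stand-in, NOT the paper's model."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions; the top k serve as tied
    # encoder/decoder weights.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]


def reconstruction_error(X, model):
    """Squared reconstruction error per sample, used as the anomaly score."""
    mean, components = model
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return np.sum((centered - reconstructed) ** 2, axis=1)


def active_learning_loop(X, oracle, k=2, rounds=5, batch=10, quantile=0.99):
    """Hypothetical loop: query the oracle on samples scored nearest the threshold.

    `oracle(indices)` is assumed to return 0 (benign) or 1 (attack) per index.
    """
    labels = np.full(len(X), -1)  # -1 unknown, 0 benign, 1 attack
    for _ in range(rounds):
        unlabeled = np.flatnonzero(labels == -1)
        if unlabeled.size == 0:
            break
        # Train only on samples not yet confirmed as attacks.
        keep = labels != 1
        model = fit_linear_autoencoder(X[keep], k)
        scores = reconstruction_error(X, model)
        threshold = np.quantile(scores[keep], quantile)
        # Assumed uncertainty criterion: score closest to the decision threshold.
        nearest = np.argsort(np.abs(scores[unlabeled] - threshold))[:batch]
        query = unlabeled[nearest]
        labels[query] = oracle(query)
    # Final refit with every confirmed attack excluded from training.
    keep = labels != 1
    model = fit_linear_autoencoder(X[keep], k)
    return reconstruction_error(X, model), labels
```

In this sketch the oracle's answers are used only to drop confirmed attacks from the training set before each refit; a fuller treatment would presumably also feed the labels into the model's objective, which the abstract does not specify.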

