LG AI NE | Aug 26, 2025

Metric Matters: A Formal Evaluation of Similarity Measures in Active Learning for Cyber Threat Intelligence

arXiv:2508.19019v1 | h-index: 10 | SISAP

Originality: Incremental advance

AI Analysis

This paper addresses stealthy APT detection for cyber defense practitioners, offering an incremental improvement by evaluating how the choice of similarity metric affects active learning.

The paper tackles the detection of Advanced Persistent Threats (APTs) in cyber defense by proposing an active learning framework that uses similarity search to improve anomaly detection. The results show that the choice of similarity metric significantly affects model convergence, accuracy, and label efficiency, with experiments on datasets including DARPA Transparent Computing APT traces.

Advanced Persistent Threats (APTs) pose a severe challenge to cyber defense due to their stealthy behavior and the extreme class imbalance inherent in detection datasets. To address these issues, we propose a novel active learning-based anomaly detection framework that leverages similarity search to iteratively refine the decision space. Built upon an Attention-Based Autoencoder, our approach uses feature-space similarity to identify normal-like and anomaly-like instances, thereby enhancing model robustness with minimal oracle supervision. Crucially, we perform a formal evaluation of various similarity measures to understand their influence on sample selection and anomaly ranking effectiveness. Through experiments on diverse datasets, including DARPA Transparent Computing APT traces, we demonstrate that the choice of similarity metric significantly impacts model convergence, anomaly detection accuracy, and label efficiency. Our results offer actionable insights for selecting similarity functions in active learning pipelines tailored for threat intelligence and cyber defense.
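
To illustrate why the similarity measure can change which samples an active learning loop picks, here is a minimal sketch in Python. It is not the authors' implementation: the function names (`top_k_similar`, `cosine_sim`, `neg_euclidean`, `neg_manhattan`), the three metrics chosen, and the toy feature vectors are all assumptions made for illustration. In the setting the abstract describes, the query would be the feature-space representation of an instance already labeled by the oracle, and the pool would be the unlabeled instances.

```python
import numpy as np

# Hypothetical similarity functions: higher score means "more similar".
def cosine_sim(query, pool):
    # Angle-based similarity; ignores vector magnitude.
    norms = np.linalg.norm(pool, axis=1) * np.linalg.norm(query)
    return (pool @ query) / (norms + 1e-12)

def neg_euclidean(query, pool):
    # Negated L2 distance, so that closer points score higher.
    return -np.linalg.norm(pool - query, axis=1)

def neg_manhattan(query, pool):
    # Negated L1 distance.
    return -np.abs(pool - query).sum(axis=1)

METRICS = {
    "cosine": cosine_sim,
    "euclidean": neg_euclidean,
    "manhattan": neg_manhattan,
}

def top_k_similar(query, pool, metric, k):
    """Return indices of the k pool rows most similar to `query`."""
    scores = METRICS[metric](query, pool)
    order = np.argsort(-scores, kind="stable")  # descending similarity
    return order[:k].tolist()

# Toy data (assumed): one labeled instance and three unlabeled candidates.
labeled_instance = np.array([1.0, 1.0])
unlabeled_pool = np.array([
    [10.0, 10.0],  # same direction as the query, but far away
    [1.0, 0.0],    # close by, but pointing in a different direction
    [2.0, 1.5],    # fairly close and fairly aligned
])

for name in METRICS:
    print(name, top_k_similar(labeled_instance, unlabeled_pool, name, k=2))
```

On this toy pool, cosine similarity ranks the distant but aligned point first, while the two distance-based measures prefer the nearby point, so the instances that would be treated as "like" the labeled one differ by metric. Whether that difference helps or hurts on real data is the kind of question the paper's evaluation addresses.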
