CR · Apr 29

Signal Decomposition Reveals Structure in Insider Threat Detection under Sparse Temporal Data

arXiv:2602.11019 · 1.4 · h-index: 1

AI Analysis

For insider threat detection practitioners, this work provides a method for handling sparse temporal data, but the contribution is incremental: it applies existing decomposition and autoencoder techniques to a known problem.

The paper addresses the challenge of insider threat detection under sparse temporal data by decomposing activity into presence and magnitude, using a dual-channel autoencoder. Results show that short attacks are detected via presence, longer attacks via magnitude, and simple aggregation of extreme scores recovers extended activity without complex sequence modeling.
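To make "simple aggregation of extreme scores" concrete, here is a minimal sketch that scores a sequence of windows by averaging its k largest per-window anomaly scores. The function name `topk_mean`, the choice of a top-k mean, and the toy scores are illustrative assumptions; the summary does not state which aggregation the paper uses.

```python
import numpy as np

def topk_mean(window_scores, k=3):
    """Average the k largest per-window scores (assumed aggregation rule)."""
    scores = np.asarray(window_scores, dtype=float)
    if scores.size == 0:
        raise ValueError("window_scores must not be empty")
    k = min(k, scores.size)
    # np.partition puts the k largest values in the last k positions.
    top = np.partition(scores, scores.size - k)[-k:]
    return float(top.mean())

# Toy scores (made up): a quiet user and one with two anomalous windows.
quiet = [0.1, 0.2, 0.1, 0.15, 0.1, 0.2]
campaign = [0.1, 0.2, 0.1, 3.0, 0.1, 2.5]

quiet_score = topk_mean(quiet, k=2)
campaign_score = topk_mean(campaign, k=2)
campaign_plain_mean = float(np.mean(campaign))
```

In this toy case the top-k mean keeps the two anomalous windows from being diluted by the inactive ones, which a plain mean over all windows would do.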

Insider threat detection is difficult because malicious behavior is rare, irregular, and buried in long periods of inactivity. In enterprise audit data, most windows contain little activity, while attacks appear intermittently and range from brief events to sustained campaigns. Standard reconstruction-based models are therefore dominated by inactive regions and tend to learn baseline behavior rather than meaningful deviations. We separate activity presence from magnitude. Each window is decomposed into a binary mask indicating whether activity occurs and a value matrix capturing its intensity. A dual-channel autoencoder reconstructs both, with value loss applied only where activity is present, directing learning toward sparse structure. Using the CERT r5.2 dataset as a controlled setting, we examine how anomaly signal changes with temporal configuration. Short attacks are detected mainly through presence; longer attacks introduce a magnitude component; noise degrades magnitude reliability and shifts detection back toward presence. The balance between channels is not fixed and follows the data. At the campaign level, signal concentrates in a small number of anomalous windows. Simple aggregation that emphasizes extreme scores is sufficient to recover extended activity without explicit sequence modeling. Effective detection depends less on model complexity and more on aligning representation and objective with sparse temporal structure.
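The decomposition and the masked objective described in the abstract can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: it treats any nonzero count as "activity present", uses binary cross-entropy for the presence channel and squared error for the magnitude channel, and the names `decompose` and `dual_channel_loss` are hypothetical. The reconstructions would come from the autoencoder, which is omitted here.

```python
import numpy as np

def decompose(window):
    """Split an activity window into a binary presence mask and a value matrix."""
    window = np.asarray(window, dtype=float)
    mask = (window > 0).astype(float)   # assumption: nonzero means active
    values = window * mask              # intensity, zero where inactive
    return mask, values

def dual_channel_loss(mask, values, mask_hat, values_hat, eps=1e-7):
    """Return (presence_loss, magnitude_loss) for one window.

    presence: binary cross-entropy over every cell (assumed loss).
    magnitude: squared error averaged over active cells only, so inactive
    regions contribute nothing to the value channel.
    """
    mask_hat = np.clip(mask_hat, eps, 1.0 - eps)
    presence = -np.mean(mask * np.log(mask_hat)
                        + (1.0 - mask) * np.log(1.0 - mask_hat))
    n_active = mask.sum()
    if n_active == 0:
        magnitude = 0.0                 # nothing to reconstruct
    else:
        magnitude = np.sum(mask * (values - values_hat) ** 2) / n_active
    return float(presence), float(magnitude)

# Toy window (made up): 2 time steps x 3 features, mostly inactive.
window = [[0, 0, 3],
          [0, 2, 0]]
mask, values = decompose(window)

# A reconstruction that is off by 1 on every cell, active or not.
presence, magnitude = dual_channel_loss(mask, values, mask, values + 1.0)

# Same error on active cells, much larger error on inactive cells.
noisy_hat = values + 1.0 + 10.0 * (1.0 - mask)
_, magnitude_noisy = dual_channel_loss(mask, values, mask, noisy_hat)
```

Because the value error is multiplied by the mask, the two magnitude losses above are identical: errors on inactive cells do not reach the value channel, which is what keeps the long inactive stretches from dominating training.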
