A Radon-Nikodým Perspective on Anomaly Detection: Theory and Implications
The work provides a theoretical foundation for anomaly detection methods, with applications in domains such as healthcare, cybersecurity, and finance, though it may appear incremental because it builds on existing concepts such as weighted loss and the local outlier factor.
The paper tackles the problem of designing effective anomaly detection loss functions by introducing RN-Loss, which multiplies the vanilla loss by the Radon-Nikodým derivative, and shows that the resulting methods outperform state-of-the-art methods on 68% of multivariate datasets (based on F1-scores) and achieve peak F1-scores on 72% of univariate time series datasets.
Which principle underpins the design of an effective anomaly detection loss function? The answer lies in the Radon-Nikodým theorem, a fundamental result in measure theory. The key insight of this article is that multiplying the vanilla loss function by the Radon-Nikodým derivative improves performance across the board. We refer to this as RN-Loss, and we prove the claim in the setting of PAC (Probably Approximately Correct) learnability. The Radon-Nikodým derivative takes different forms depending on the context. In the simplest case, supervised anomaly detection, it reduces to a simple weighted loss. In unsupervised anomaly detection (with distributional assumptions), it takes the form of the popular cluster-based local outlier factor. We evaluate our algorithm on 96 datasets, covering univariate and multivariate data from diverse domains including healthcare, cybersecurity, and finance. RN-Derivative algorithms outperform state-of-the-art methods on 68% of the multivariate datasets (based on F1-scores) and achieve peak F1-scores on 72% of the univariate time series datasets.
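As a rough illustration of the supervised case, where the Radon-Nikodým derivative is said to reduce to a weighted loss, the sketch below reweights a binary cross-entropy by a per-class density ratio q(y)/p(y). The abstract does not spell out the exact weights, so everything here is an assumption made for illustration: p is taken to be the empirical class distribution of the training labels, q a uniform target distribution over classes, and the function names `density_ratio_weights` and `weighted_bce` are hypothetical.

```python
import math
from collections import Counter


def density_ratio_weights(labels, target_prior=None):
    """Per-sample weights q(y) / p(y) for discrete labels.

    p is the empirical class distribution of `labels`; q defaults to a
    uniform distribution over the observed classes (an assumption, not
    necessarily the choice made in the paper).
    """
    counts = Counter(labels)
    n = len(labels)
    classes = sorted(counts)
    if target_prior is None:
        target_prior = {c: 1.0 / len(classes) for c in classes}
    ratio = {c: target_prior[c] / (counts[c] / n) for c in classes}
    return [ratio[y] for y in labels]


def weighted_bce(labels, probs, weights, eps=1e-12):
    """Mean of weight * vanilla binary cross-entropy over all samples."""
    total = 0.0
    for y, p, w in zip(labels, probs, weights):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        vanilla = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += w * vanilla
    return total / len(labels)


# Toy data: 9 normal points (label 0) and 1 anomaly (label 1).
labels = [0] * 9 + [1]
# Hypothetical predicted anomaly probabilities; the anomaly is under-scored.
probs = [0.1] * 9 + [0.3]

weights = density_ratio_weights(labels)
vanilla_loss = weighted_bce(labels, probs, [1.0] * len(labels))
rn_loss = weighted_bce(labels, probs, weights)

print("weight for a normal point:", weights[0])
print("weight for the anomaly:", weights[-1])
print("vanilla loss:", vanilla_loss)
print("reweighted loss:", rn_loss)
```

On this toy data the single anomaly receives a much larger weight than each normal point, so the under-scored anomaly dominates the reweighted loss. The weights also average to one over the training sample, which matches the property that a density ratio q/p has expectation one under p.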