Multi-criteria Similarity-based Anomaly Detection using Pareto Depth Analysis
This paper addresses multi-criteria anomaly detection in applications where the relative importance of the dissimilarity measures is not known in advance, removing the need to rerun a detector over many choices of weights.
The paper tackles anomaly detection when multiple dissimilarity measures are needed but their relative importance is unknown. It proposes Pareto depth analysis (PDA), which uses Pareto optimality to detect anomalies without multiple runs under different weight choices. PDA is provably better than linear combinations of the criteria and performs better in experiments on synthetic and real data sets.
We consider the problem of identifying patterns in a data set that exhibit anomalous behavior, often referred to as anomaly detection. Similarity-based anomaly detection algorithms detect abnormally large amounts of similarity or dissimilarity, e.g., as measured by nearest neighbor Euclidean distances between a test sample and the training samples. In many application domains there may not exist a single dissimilarity measure that captures all possible anomalous patterns. In such cases, multiple dissimilarity measures can be defined, including non-metric measures, and one can test for anomalies by scalarizing them with a non-negative linear combination. If the relative importance of the different dissimilarity measures is not known in advance, as in many anomaly detection applications, the anomaly detection algorithm may need to be executed multiple times with different choices of weights in the linear combination. In this paper, we propose a method for similarity-based anomaly detection using a novel multi-criteria dissimilarity measure, the Pareto depth. The proposed Pareto depth analysis (PDA) anomaly detection algorithm uses the concept of Pareto optimality to detect anomalies under multiple criteria without having to run an algorithm multiple times with different choices of weights. The proposed PDA approach is provably better than using linear combinations of the criteria and shows superior performance in experiments with synthetic and real data sets.
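To make the contrast between Pareto optimality and linear scalarization concrete, the following is a minimal sketch and not the PDA algorithm itself, whose details the abstract does not give. It assumes each item is described by a tuple of dissimilarities in which smaller is better, and it assigns each item the index of the successive Pareto front it lies on (generic non-dominated sorting). The names `dominates`, `pareto_depths`, and `scalarized_minimizers`, as well as the toy data, are hypothetical and chosen for illustration only.

```python
def dominates(a, b):
    """True if a is no worse than b in every criterion and strictly
    better in at least one (assumption: smaller values are better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_depths(points):
    """Peel off successive Pareto fronts; depth 1 is the first front."""
    depths = [0] * len(points)
    remaining = list(range(len(points)))
    depth = 0
    while remaining:
        depth += 1
        # Points not dominated by any other remaining point form the next front.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        for i in front:
            depths[i] = depth
        front_set = set(front)
        remaining = [i for i in remaining if i not in front_set]
    return depths


def scalarized_minimizers(points, weights):
    """Indices that minimize w*c1 + (1-w)*c2 for at least one weight w
    (two criteria only; ties go to the lowest index)."""
    winners = set()
    for w in weights:
        scores = [w * c1 + (1 - w) * c2 for c1, c2 in points]
        winners.add(scores.index(min(scores)))
    return winners


# Toy two-criteria data: the first three points are mutually non-dominated.
criteria = [(0.0, 1.0), (1.0, 0.0), (0.6, 0.6), (2.0, 2.0)]
print(pareto_depths(criteria))  # [1, 1, 1, 2]

# Sweep eleven weights from 0 to 1; the point (0.6, 0.6) never wins.
weights = [k / 10 for k in range(11)]
print(sorted(scalarized_minimizers(criteria, weights)))  # [0, 1]
```

In this toy example the point `(0.6, 0.6)` lies on the first Pareto front, yet no non-negative weighting makes it the minimizer of the weighted sum, because it sits on a non-convex part of the front. A single pass of front peeling ranks all points at once, whereas the scalarized approach needs one run per weight and still misses such points.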