Bayesian Anomaly Detection Using Extreme Value Theory
This work addresses threshold sensitivity, a practical limitation of anomaly detection methods, and offers an incremental improvement for data-driven applications.
The paper tackles the problem of setting thresholds in anomaly detection by proposing a probabilistic framework that explicitly models both normal and anomalous behavior, using extreme value theory to treat anomalies as the extremes of the normal behavior. The result is a joint non-parametric clustering and anomaly detection algorithm based on a Dirichlet Process Mixture Model.
Data-driven anomaly detection methods typically build a model for the normal behavior of the target system, and score each data instance with respect to this model. A threshold is invariably needed to identify data instances with high (or low) scores as anomalies. This presents a practical limitation on the applicability of such methods, since most methods are sensitive to the choice of the threshold, and it is challenging to set optimal thresholds. We present a probabilistic framework to explicitly model the normal and anomalous behaviors and probabilistically reason about the data. A formulation based on extreme value theory is proposed to model the anomalous behavior as the extremes of the normal behavior. As a specific instantiation, a joint non-parametric clustering and anomaly detection algorithm is proposed that models the normal behavior as a Dirichlet Process Mixture Model.
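
To make the idea of modeling anomalies as "the extremes of the normal behavior" concrete, the following is a minimal sketch and not the paper's algorithm. It assumes, purely for illustration, a single one-dimensional Gaussian as the normal model (the paper uses a Dirichlet Process Mixture Model instead) and considers only the upper tail. The function names `normal_cdf` and `extreme_probability` are hypothetical. The sketch uses the standard fact that the maximum of n independent draws from a distribution with CDF F has CDF F(x)^n, so F(x)^n can be read as the probability that x lies beyond the largest of n normal observations.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def extreme_probability(x, mu, sigma, n):
    """Probability that the maximum of n i.i.d. Gaussian draws is <= x.

    A value close to 1 means x exceeds what even the most extreme of n
    normal observations would typically reach (illustrative assumption:
    one-sided, single Gaussian normal model).
    """
    return normal_cdf(x, mu, sigma) ** n


# Fit the (assumed) Gaussian normal model to synthetic "normal" data.
random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(1000)]
mu_hat = statistics.mean(train)
sigma_hat = statistics.stdev(train)

# Score a few points; the score grows with how extreme the point is
# relative to the extremes expected from len(train) normal observations.
for x in (0.0, 3.0, 4.0, 6.0):
    p = extreme_probability(x, mu_hat, sigma_hat, len(train))
    print(f"x = {x:4.1f}  P(more extreme than normal maximum) = {p:.4f}")
```

The output is a probability rather than a raw score, which is the motivation stated in the abstract: a point can be judged against the expected extremes of the normal model instead of against a hand-tuned cutoff on the score. Note that the result depends on n, the number of normal observations the point is compared against.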