ML, Jan 7, 2021
Copula Quadrant Similarity for Anomaly Scores
Matthew Davidow, David Matteson
Practical anomaly detection requires applying numerous approaches due to the inherent difficulty of unsupervised learning. Direct comparison between complex or opaque anomaly detection algorithms is intractable; we instead propose a framework for associating the scores of multiple methods. Our aim is to answer the question: how should one measure the similarity between anomaly scores generated by different methods? The crux of scoring lies in the extremes, which identify the most anomalous observations. A pair of algorithms is defined here to be similar if they assign their highest scores to roughly the same small fraction of observations. To formalize this, we propose a measure based on extremal similarity in scoring distributions through a novel upper quadrant modeling approach, and contrast it with tail and other dependence measures. We illustrate our method with simulated and real experiments, applying spectral methods to cluster multiple anomaly detection methods and to contrast our similarity measure with others. We demonstrate that our method is able to detect clusters of anomaly detection algorithms, yielding an accurate and robust ensemble algorithm.
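The informal definition of similarity in this abstract (two methods agree when their highest scores fall on roughly the same small fraction of observations) can be illustrated with a simple set overlap. The sketch below is not the copula-based upper quadrant measure the paper proposes; it is a plain Jaccard overlap of the top-scored observations, and the function names and the `fraction` parameter are hypothetical choices made for illustration.

```python
import random


def top_fraction_indices(scores, fraction=0.05):
    """Return the set of indices holding the highest `fraction` of scores."""
    k = max(1, int(round(fraction * len(scores))))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def top_overlap_similarity(scores_a, scores_b, fraction=0.05):
    """Jaccard overlap of the top-scored observations of two methods.

    Illustrative stand-in only: NOT the paper's upper quadrant measure.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both methods must score the same observations")
    top_a = top_fraction_indices(scores_a, fraction)
    top_b = top_fraction_indices(scores_b, fraction)
    return len(top_a & top_b) / len(top_a | top_b)


if __name__ == "__main__":
    rng = random.Random(0)
    base = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    rescaled = [x ** 3 for x in base]                   # same ranking, different scale
    unrelated = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    print(top_overlap_similarity(base, rescaled))       # identical top sets
    print(top_overlap_similarity(base, unrelated))      # mostly disjoint top sets
```

Because the overlap depends only on which observations rank highest, it is unchanged by monotone rescaling of either method's scores, which is the property that makes an extremes-based comparison attractive when methods report scores on different scales.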
ME, Oct 15, 2018
ABACUS: Unsupervised Multivariate Change Detection via Bayesian Source Separation
Wenyu Zhang, Daniel Gilbert, David Matteson
Change detection involves segmenting sequential data such that observations in the same segment share some desired properties. Multivariate change detection continues to be a challenging problem due to the variety of ways change points can be correlated across channels and the potentially poor signal-to-noise ratio on individual channels. In this paper, we are interested in locating additive outliers (AO) and level shifts (LS) in the unsupervised setting. We propose ABACUS, Automatic BAyesian Changepoints Under Sparsity, a Bayesian source separation technique to recover latent signals while also detecting changes in model parameters. Multi-level sparsity achieves both dimension reduction and modeling of signal changes. We show ABACUS has competitive or superior performance in simulation studies against state-of-the-art change detection methods and established latent variable models. We also illustrate ABACUS on two real applications, modeling genomic profiles and analyzing household electricity consumption.
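To make the two change types named in this abstract concrete, the sketch below simulates a single channel containing one additive outlier (a spike at one time point) and one level shift (a persistent change in the mean). It only illustrates the kinds of change being targeted and does not implement ABACUS; the function name and all parameters are hypothetical.

```python
import random


def simulate_channel(n, shift_at, shift_size, outlier_at, outlier_size,
                     noise_sd=1.0, seed=0):
    """Simulate one channel with a level shift (LS) and an additive outlier (AO).

    Illustrative data generator only: it is not part of ABACUS.
    """
    rng = random.Random(seed)
    series = []
    for t in range(n):
        level = shift_size if t >= shift_at else 0.0     # LS: mean changes and stays changed
        spike = outlier_size if t == outlier_at else 0.0  # AO: affects a single time point
        series.append(level + spike + rng.gauss(0.0, noise_sd))
    return series


if __name__ == "__main__":
    channel = simulate_channel(n=200, shift_at=120, shift_size=3.0,
                               outlier_at=40, outlier_size=8.0)
    before = sum(channel[:120]) / 120
    after = sum(channel[120:]) / 80
    print(f"mean before shift: {before:.2f}, mean after shift: {after:.2f}")
    print(f"value at the outlier: {channel[40]:.2f}")
```

In a multivariate setting, several such channels would share or partially share the change locations, which is the correlation structure the abstract describes as making the problem hard.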