ML · LG · HEP-EX · HEP-PH · Jul 6, 2021

New Methods and Datasets for Group Anomaly Detection From Fundamental Physics

arXiv:2107.02821v1 · 16 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the lack of benchmark datasets for group anomaly detection, which matters for applications such as fundamental physics, but its contribution is incremental.

The paper tackles group anomaly detection by introducing a realistic synthetic benchmark dataset (LHCO2020) and comparing several existing unsupervised techniques, demonstrating their performance on this dataset.

The identification of anomalous overdensities in data, known as group or collective anomaly detection, is a rich problem with a large number of real-world applications. However, it has received relatively little attention in the broader ML community compared to point anomalies or other types of single-instance outliers. One reason for this is the lack of powerful benchmark datasets. In this paper, we first explain how, after the Nobel Prize-winning discovery of the Higgs boson, unsupervised group anomaly detection has become a new frontier of fundamental physics (where the motivation is to find new particles and forces). Then we propose a realistic synthetic benchmark dataset (LHCO2020) for the development of group anomaly detection algorithms. Finally, we compare several existing statistically sound techniques for unsupervised group anomaly detection and demonstrate their performance on the LHCO2020 dataset.

