Null Hypothesis Test for Anomaly Detection
This work provides a method for anomaly detection in physics and similar domains that does not rely on fixed anomaly score cuts, though it appears incremental since it builds on existing decorrelation techniques.
The paper tackles anomaly detection by extending Classification Without Labels with a hypothesis test that excludes the background-only hypothesis by testing the statistical independence of dataset regions. It reports excellent performance on the LHC Olympics dataset, with robust results across different signal fractions.
We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions. The method relies on the assumption of conditional independence of anomaly score features and dataset regions, which can be ensured using existing decorrelation techniques. As a benchmark example, we consider the LHC Olympics dataset, where we show that mutual information represents a suitable test for statistical independence and that our method exhibits excellent and robust performance at different signal fractions, even in the presence of realistic feature correlations.
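As a rough illustration of the idea described above, the sketch below tests whether an anomaly score is statistically independent of a binary region label, using mutual information as the test statistic. It is not the authors' implementation: the quantile-binned plug-in estimator, the permutation-based p-value, the function names, and the toy data are all assumptions made for this example. Under the background-only hypothesis the score and the region label should be independent, so a small p-value counts as evidence against that hypothesis.

```python
import numpy as np


def binned_mutual_information(score, region, n_bins=10):
    """Plug-in mutual information (in nats) between a continuous score
    and a binary region label (0 or 1). Hypothetical helper for this sketch."""
    # Quantile bin edges, so each score bin holds roughly equal counts.
    edges = np.quantile(score, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, score, side="right") - 1, 0, n_bins - 1)

    # Joint histogram of (score bin, region), normalised to probabilities.
    joint = np.zeros((n_bins, 2))
    np.add.at(joint, (bins, region), 1.0)
    joint /= joint.sum()

    # Product of the marginals, i.e. the joint expected under independence.
    independent = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / independent[mask])))


def permutation_p_value(score, region, n_perm=200, seed=0):
    """P-value for the null hypothesis that score and region are independent.
    Shuffling the region labels breaks any dependence on the score."""
    rng = np.random.default_rng(seed)
    observed = binned_mutual_information(score, region)
    null = np.array([
        binned_mutual_information(score, rng.permutation(region))
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


# Toy data (assumed for illustration only, not the LHC Olympics dataset).
rng = np.random.default_rng(1)
n_events = 4000
region = rng.integers(0, 2, size=n_events)   # 0 and 1 label the two regions
score_bkg = rng.uniform(size=n_events)       # score independent of region

# Inject "signal": a fraction of region-1 events receive high scores.
score_sig = score_bkg.copy()
is_signal = (region == 1) & (rng.uniform(size=n_events) < 0.2)
score_sig[is_signal] = rng.uniform(0.9, 1.0, size=is_signal.sum())

p_bkg = permutation_p_value(score_bkg, region)
p_sig = permutation_p_value(score_sig, region)
print(f"background-only p-value: {p_bkg:.3f}")
print(f"signal-injected p-value: {p_sig:.3f}")
```

The permutation approach is used here because it requires no assumption about the null distribution of the mutual information estimate. The smallest p-value it can report is 1 / (n_perm + 1), so a realistic analysis would need many more permutations or an analytic null distribution.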