Neural Embedding: Learning the Embedding of the Manifold of Physics Data
This paper addresses the challenge of extracting physically meaningful representations from complex high-dimensional datasets in collider physics, and proposes a way to quantify the search capability of model-agnostic anomaly detection.
The paper embeds physics data manifolds with metric structure into lower-dimensional Euclidean or hyperbolic spaces. It shows that the embedding learns the latent structure of simulated Large Hadron Collider collisions, and it uses the notion of volume in Euclidean space to give what the authors present as the first viable way to quantify the search capability of model-agnostic anomaly detection.
In this paper, we present a method for embedding physics data manifolds with metric structure into lower-dimensional spaces with simpler metrics, such as Euclidean and hyperbolic spaces. We then demonstrate that this embedding can be a powerful step in the data analysis pipeline for many applications. Using progressively more realistic simulated collisions at the Large Hadron Collider, we show that the embedding learns the underlying latent structure. Using the notion of volume in Euclidean spaces, we provide, for the first time, a viable way to quantify the true search capability of model-agnostic search algorithms in collider physics (i.e., anomaly detection). Finally, we discuss how the ideas presented in this paper can be applied to practical challenges that require extracting physically meaningful representations from complex high-dimensional datasets.
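To make the idea of a distance-preserving embedding concrete, the following is a minimal sketch and not the method of the paper. It assumes that the data manifold is described only by a matrix of pairwise distances, and it replaces the trained neural network with closed-form classical multidimensional scaling. The function names `pairwise_distances` and `classical_mds`, the toy dataset, and the use of the Euclidean metric as a stand-in for a physics metric are all illustrative assumptions.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance matrix between the rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(dist, dim):
    """Embed points into `dim` Euclidean dimensions from a distance matrix.

    Classical multidimensional scaling: double-center the squared
    distances to obtain a Gram matrix, then keep its leading eigenvectors.
    This is a stand-in for a learned embedding, not the paper's network.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]        # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


# Toy "manifold" (assumption): a flat 2D sheet placed isometrically in 5D.
rng = np.random.default_rng(0)
latent = rng.uniform(-1.0, 1.0, size=(50, 2))
basis, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # 5x2, orthonormal columns
data = latent @ basis.T                           # 50 points in 5 dimensions

target = pairwise_distances(data)                 # the metric to preserve
embedded = classical_mds(target, dim=2)           # 50 points in 2 dimensions

# Largest absolute mismatch between original and embedded distances.
distortion = np.abs(pairwise_distances(embedded) - target).max()
print(f"max distance distortion: {distortion:.2e}")
```

Because the toy sheet is flat, a two-dimensional Euclidean embedding can reproduce its distances up to numerical precision. For a curved manifold or a non-Euclidean metric some distortion is unavoidable, which is where a learned embedding with a distortion-minimizing loss would take over from this closed-form substitute.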