HEP-EX Jun 22, 2023
Triggering Dark Showers with Conditional Dual Auto-Encoders
Luca Anzalone, Simranjit Singh Chhibra, Benedikt Maier et al.
We present a family of conditional dual auto-encoders (CoDAEs) for generic and model-independent new physics searches at colliders. New physics signals, which arise from new types of particles and interactions, are treated in our study as anomalies that cause deviations in data with respect to expected background events. In this work, we perform normal-only anomaly detection, which uses only background samples, to search for manifestations of a dark version of the strong force. We apply (variational) auto-encoders to raw detector images, which are large and highly sparse, without any physics-based pre-processing or strong assumptions about the signals. The proposed CoDAE has a dual-encoder design, which is general and can learn an auxiliary yet compact latent space through spatial conditioning. It shows a clear improvement over competitive physics-based baselines and related approaches, and thereby narrows the gap with fully supervised models. This is the first time an unsupervised model has been shown to exhibit excellent discrimination against multiple dark shower models, illustrating the suitability of this method as an accurate, fast, model-independent algorithm to deploy, e.g., in the real-time event triggering systems of Large Hadron Collider experiments such as ATLAS and CMS.
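The normal-only setup described above can be illustrated with a minimal sketch: a scorer is fitted on background events only, and the trigger threshold is calibrated on background scores alone. Everything here is an assumption made for illustration: the toy sparse "images", the helper names, and in particular the reconstruction step, where a constant mean-background template stands in for a trained auto-encoder. It is not the CoDAE architecture.

```python
import random


def mse(image, reconstruction):
    """Mean squared error between an image and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(image, reconstruction)) / len(image)


def fit_template(background):
    """Stand-in for training: the 'reconstruction' is the mean background image.

    A real auto-encoder would learn an input-dependent reconstruction; this
    constant template is only a placeholder (assumption).
    """
    n = len(background)
    return [sum(pixel_column) / n for pixel_column in zip(*background)]


def threshold_at_fpr(background_scores, fpr):
    """Pick a score threshold that at most a fraction `fpr` of background exceeds."""
    ordered = sorted(background_scores)
    k = int((1.0 - fpr) * len(ordered))
    k = min(max(k, 0), len(ordered) - 1)
    return ordered[k]


def sparse_image(rng, n_pixels, n_hits):
    """Toy sparse detector image: a few non-zero pixels, the rest zero."""
    image = [0.0] * n_pixels
    for i in rng.sample(range(n_pixels), n_hits):
        image[i] = rng.uniform(0.5, 1.0)
    return image


rng = random.Random(0)
# Hypothetical toy data: background has few hits, "signal" has many.
background = [sparse_image(rng, 64, 4) for _ in range(500)]
signal = [sparse_image(rng, 64, 16) for _ in range(100)]

template = fit_template(background)  # fitted on background only
bkg_scores = [mse(img, template) for img in background]
sig_scores = [mse(img, template) for img in signal]

threshold = threshold_at_fpr(bkg_scores, fpr=0.01)
bkg_rate = sum(s > threshold for s in bkg_scores) / len(bkg_scores)
sig_eff = sum(s > threshold for s in sig_scores) / len(sig_scores)
print(f"background accept rate: {bkg_rate:.3f}, signal efficiency: {sig_eff:.3f}")
```

No signal sample enters the fit or the threshold choice; signal events are used only afterwards to measure efficiency, which is what makes the procedure model-independent.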
HEP-PH Dec 13, 2024
Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics
Oz Amram, Luca Anzalone, Joschka Birk et al.
Foundation models are deep learning models pre-trained on large amounts of data which are capable of generalizing to multiple datasets and/or downstream tasks. This work demonstrates how data collected by the CMS experiment at the Large Hadron Collider can be useful in pre-training foundation models for HEP. Specifically, we introduce the AspenOpenJets dataset, consisting of approximately 178M high-$p_T$ jets derived from CMS 2016 Open Data. We show how pre-training the OmniJet-$\alpha$ foundation model on AspenOpenJets improves performance on generative tasks with significant domain shift: generating boosted top and QCD jets from the simulated JetClass dataset. In addition to demonstrating the power of pre-training a jet-based foundation model on actual proton-proton collision data, we provide the ML-ready derived AspenOpenJets dataset for further public use.
HEP-EX Feb 1, 2022
Improving Parametric Neural Networks for High-Energy Physics (and Beyond)
Luca Anzalone, Tommaso Diotalevi, Daniele Bonacorsi
Signal-background classification is a central problem in High-Energy Physics (HEP), which plays a major role in the discovery of new fundamental particles. A recent method, the Parametric Neural Network (pNN), leverages multiple signal mass hypotheses as an additional input feature to effectively replace a whole set of individual classifiers, each providing (in principle) the best response for the corresponding mass hypothesis. In this work we aim to deepen the understanding of pNNs in light of real-world usage. We identify several peculiarities of parametric networks and provide intuition, metrics, and guidelines for working with them. We further propose an alternative parametrization scheme, resulting in a new parametrized neural network architecture, the AffinePNN, along with other generally applicable improvements such as a balanced training procedure. Finally, we extensively and empirically evaluate our models on the HEPMASS dataset, along with its imbalanced version (called HEPMASS-IMB), which we provide here for the first time, to further validate our approach. Results are reported in terms of the impact of the proposed design decisions, classification performance, and interpolation capability.
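The idea of feeding the mass hypothesis to the network as an extra input can be sketched in a few lines. The abstract confirms only that the mass is an additional input feature, so the rest is assumption: the function names are hypothetical, the mass values are illustrative, and assigning background events a mass drawn from the signal hypotheses is a common convention for parametric networks rather than something stated above. The affine form (a mass-dependent scale and shift applied to hidden activations) is one plausible reading of the name "AffinePNN", not the paper's confirmed layer.

```python
import random


def concat_inputs(features, mass):
    """pNN-style input: append the mass hypothesis as one more feature."""
    return list(features) + [mass]


def affine_condition(hidden, mass, scale_w, scale_b, shift_w, shift_b):
    """Assumed affine conditioning on the mass m, applied per hidden unit:

        h_i -> (scale_w_i * m + scale_b_i) * h_i + (shift_w_i * m + shift_b_i)
    """
    return [
        (sw * mass + sb) * h + (tw * mass + tb)
        for h, sw, sb, tw, tb in zip(hidden, scale_w, scale_b, shift_w, shift_b)
    ]


def assign_background_mass(n_background, signal_masses, rng):
    """Background has no true mass; draw one per event from the signal hypotheses."""
    return [rng.choice(signal_masses) for _ in range(n_background)]


rng = random.Random(0)
signal_masses = [500.0, 750.0, 1000.0, 1250.0, 1500.0]  # illustrative values
max_mass = max(signal_masses)

# One event with three (hypothetical) kinematic features, evaluated at m = 1000.
features = [0.2, -1.3, 0.7]
row = concat_inputs(features, 1000.0 / max_mass)  # mass scaled to [0, 1]

# Masses assigned to four background events for training.
bkg_masses = assign_background_mass(4, signal_masses, rng)

# Affine conditioning of two hidden units at scaled mass m = 0.5.
hidden = [1.0, 2.0]
conditioned = affine_condition(
    hidden, 0.5,
    scale_w=[0.0, 2.0], scale_b=[1.0, 0.0],
    shift_w=[0.0, 0.0], shift_b=[0.0, 1.0],
)
print(row, bkg_masses, conditioned)
```

With concatenation the mass is just one more input column. With the affine variant the mass modulates the hidden representation directly, so the same trained network can be evaluated at any mass, including values between the training hypotheses.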