Benjamin Nachman

HEP-PH
h-index120
52papers
3,755citations
Novelty43%
AI Score55

52 Papers

HEP-PHJun 17, 2022
Score-based Generative Models for Calorimeter Shower Simulation

Vinicius Mikuni, Benjamin Nachman

Score-based generative models are a new class of generative algorithms that have been shown to produce realistic images even in high dimensional spaces, currently surpassing other state-of-the-art models for different benchmark categories and applications. In this work we introduce CaloScore, a score-based generative model for collider physics applied to calorimeter shower generation. Three different diffusion models are investigated using the Fast Calorimeter Simulation Challenge 2022 dataset. CaloScore is the first application of a score-based generative model in collider physics and is able to produce high-fidelity calorimeter images for all datasets, providing an alternative paradigm for calorimeter shower simulation.

HEP-PHMar 15, 2022
New directions for surrogate models and differentiable programming for High Energy Physics detector simulation

Andreas Adelmann, Walter Hopkins, Evangelos Kourlitis et al.

The computational cost for high energy physics detector simulation in future experimental facilities is going to exceed the current available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, providing controllable and scalable simulation routines. In this document, new and ongoing efforts for surrogate models and differential programming applied to detector simulation are discussed in the context of the 2021 Particle Physics Community Planning Exercise (`Snowmass').

HEP-PHAug 23, 2023
Improving Generative Model-based Unfolding with Schrödinger Bridges

Sascha Diefenbacher, Guan-Horng Liu, Vinicius Mikuni et al.

Machine learning-based unfolding has enabled unbinned and high-dimensional differential cross section measurements. Two main approaches have emerged in this research area: one based on discriminative models and one based on generative models. The main advantage of discriminative models is that they learn a small correction to a starting simulation while generative models scale better to regions of phase space with little data. We propose to use Schroedinger Bridges and diffusion models to create SBUnfold, an unfolding approach that combines the strengths of both discriminative and generative models. The key feature of SBUnfold is that its generative model maps one set of events into another without having to go through a known probability density as is the case for normalizing flows and standard diffusion models. We show that SBUnfold achieves excellent performance compared to state of the art methods on a synthetic Z+jets dataset.

LGJul 10, 2023
Comparison of Point Cloud and Image-based Models for Calorimeter Fast Simulation

Fernando Torales Acosta, Vinicius Mikuni, Benjamin Nachman et al.

Score based generative models are a new class of generative models that have been shown to accurately generate high dimensional calorimeter datasets. Recent advances in generative models have used images with 3D voxels to represent and model complex calorimeter showers. Point clouds, however, are likely a more natural representation of calorimeter showers, particularly in calorimeters with high granularity. Point clouds preserve all of the information of the original simulation, more naturally deal with sparse datasets, and can be implemented with more compact models and data files. In this work, two state-of-the-art score based models are trained on the same set of calorimeter simulation and directly compared.

HEP-PHJun 6, 2023
High-dimensional and Permutation Invariant Anomaly Detection

Vinicius Mikuni, Benjamin Nachman

Methods for anomaly detection of new physics processes are often limited to low-dimensional spaces due to the difficulty of learning high-dimensional probability densities. Particularly at the constituent level, incorporating desirable properties such as permutation invariance and variable-length inputs becomes difficult within popular density estimation methods. In this work, we introduce a permutation-invariant density estimator for particle physics data based on diffusion models, specifically designed to handle variable-length inputs. We demonstrate the efficacy of our methodology by utilizing the learned density as a permutation-invariant anomaly detection score, effectively identifying jets with low likelihood under the background-only hypothesis. To validate our density estimation method, we investigate the ratio of learned densities and compare to those obtained by a supervised classification algorithm.

COApr 15
FAIR Universe Weak Lensing ML Uncertainty Challenge: Handling Uncertainties and Distribution Shifts for Precision Cosmology

Biwei Dai, Po-Wen Chang, Wahid Bhimji et al.

Weak gravitational lensing, the correlated distortion of background galaxy shapes by foreground structures, is a powerful probe of the matter distribution in our universe and allows accurate constraints on the cosmological model. In recent years, high-order statistics and machine learning (ML) techniques have been applied to weak lensing data to extract the nonlinear information beyond traditional two-point analysis. However, these methods typically rely on cosmological simulations, which poses several challenges: simulations are computationally expensive, limiting most realistic setups to a low training data regime; inaccurate modeling of systematics in the simulations create distribution shifts that can bias cosmological parameter constraints; and varying simulation setups across studies make method comparison difficult. To address these difficulties, we present the first weak lensing benchmark dataset with several realistic systematics and launch the FAIR Universe Weak Lensing Machine Learning Uncertainty Challenge. The challenge focuses on measuring the fundamental properties of the universe from weak lensing data with limited training set and potential distribution shifts, while providing a standardized benchmark for rigorous comparison across methods. Organized in two phases, the challenge will bring together the physics and ML communities to advance the methodologies for handling systematic uncertainties, data efficiency, and distribution shifts in weak lensing analysis with ML, ultimately facilitating the deployment of ML approaches into upcoming weak lensing survey analysis.

HEP-PHOct 20, 2022
Machine-Learning Compression for Particle Physics Discoveries

Jack H. Collins, Yifeng Huang, Simon Knapen et al.

In collider-based particle and nuclear physics experiments, data are produced at such extreme rates that only a subset can be recorded for later analysis. Typically, algorithms select individual collision events for preservation and store the complete experimental response. A relatively new alternative strategy is to additionally save a partial record for a larger subset of events, allowing for later specific analysis of a larger fraction of events. We propose a strategy that bridges these paradigms by compressing entire events for generic offline analysis but at a lower fidelity. An optimal-transport-based $β$ Variational Autoencoder (VAE) is used to automate the compression and the hyperparameter $β$ controls the compression fidelity. We introduce a new approach for multi-objective learning functions by simultaneously learning a VAE appropriate for all values of $β$ through parameterization. We present an example use case, a di-muon resonance search at the Large Hadron Collider (LHC), where we show that simulated data compressed by our $β$-VAE has enough fidelity to distinguish distinct signal morphologies.

HEP-PHDec 20, 2022
Resonant Anomaly Detection with Multiple Reference Datasets

Mayee F. Chen, Benjamin Nachman, Frederic Sala

An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal. Such techniques, including Classification Without Labels (CWoLa) and Simulation Assisted Likelihood-free Anomaly Detection (SALAD) rely on a single reference dataset. They cannot take advantage of commonly-available multiple datasets and thus cannot fully exploit available information. In this work, we propose generalizations of CWoLa and SALAD for settings where multiple reference datasets are available, building on weak supervision techniques. We demonstrate improved performance in a number of settings with realistic and synthetic data. As an added benefit, our generalizations enable us to provide finite-sample guarantees, improving on existing asymptotic analyses.

HEP-PHSep 12, 2023
Flows for Flows: Morphing one Dataset into another with Maximum Likelihood Estimation

Tobias Golling, Samuel Klein, Radha Mastandrea et al.

Many components of data analysis in high energy physics and beyond require morphing one dataset into another. This is commonly solved via reweighting, but there are many advantages of preserving weights and shifting the data points instead. Normalizing flows are machine learning models with impressive precision on a variety of particle physics tasks. Naively, normalizing flows cannot be used for morphing because they require knowledge of the probability density of the starting dataset. In most cases in particle physics, we can generate more examples, but we do not know densities explicitly. We propose a protocol called flows for flows for training normalizing flows to morph one dataset into another even if the underlying probability density of neither dataset is known explicitly. This enables a morphing strategy trained with maximum likelihood estimation, a setup that has been shown to be highly effective in related tasks. We study variations on this protocol to explore how far the data points are moved to statistically match the two datasets. Furthermore, we show how to condition the learned flows on particular features in order to create a morphing function for every value of the conditioning feature. For illustration, we demonstrate flows for flows for toy examples as well as a collider physics example involving dijet events

INS-DETOct 2, 2023
The Optimal use of Segmentation for Sampling Calorimeters

Fernando Torales Acosta, Bishnu Karki, Piyush Karande et al.

One of the key design choices of any sampling calorimeter is how fine to make the longitudinal and transverse segmentation. To inform this choice, we study the impact of calorimeter segmentation on energy reconstruction. To ensure that the trends are due entirely to hardware and not to a sub-optimal use of segmentation, we deploy deep neural networks to perform the reconstruction. These networks make use of all available information by representing the calorimeter as a point cloud. To demonstrate our approach, we simulate a detector similar to the forward calorimeter system intended for use in the ePIC detector, which will operate at the upcoming Electron Ion Collider. We find that for the energy estimation of isolated charged pion showers, relatively fine longitudinal segmentation is key to achieving an energy resolution that is better than 10% across the full phase space. These results provide a valuable benchmark for ongoing EIC detector optimizations and may also inform future studies involving high-granularity calorimeters in other experiments at various facilities.

HEP-PHApr 8, 2022
Exploring the Universality of Hadronic Jet Classification

Kingman Cheung, Yi-Lun Chung, Shih-Chieh Hsu et al.

The modeling of jet substructure significantly differs between Parton Shower Monte Carlo (PSMC) programs. Despite this, we observe that machine learning classifiers trained on different PSMCs learn nearly the same function. This means that when these classifiers are applied to the same PSMC for testing, they result in nearly the same performance. This classifier universality indicates that a machine learning model trained on one simulation and tested on another simulation (or data) will likely be optimal. Our observations are based on detailed studies of shallow and deep neural networks applied to simulated Lorentz boosted Higgs jet tagging at the LHC.

HEP-EXJan 23
EveNet: A Foundation Model for Particle Collision Data Analysis

Ting-Hsiang Hsu, Bai-Hong Zhou, Qibin Liu et al.

While deep learning is transforming data analysis in high-energy physics, computational challenges limit its potential. We address these challenges in the context of collider physics by introducing EveNet, an event-level foundation model pretrained on 500 million simulated collision events using a hybrid objective of self-supervised learning and physics-informed supervision. By leveraging a shared particle-cloud representation, EveNet outperforms state-of-the-art baselines across diverse tasks, including searches for heavy resonances and exotic Higgs decays, and demonstrates exceptional data efficiency in low-statistics regimes. Crucially, we validate the transferability of the model to experimental data by rediscovering the $Υ$ meson in CMS Open Data and show its capacity for precision physics through the robust extraction of quantum correlation observables stable against systematic uncertainties. These results indicate that EveNet can successfully encode the fundamental physical structure of particle interactions, which offers a unified and resource-efficient framework to accelerate discovery at current and future colliders.

HEP-EXApr 14
Cross-Domain Transfer with Particle Physics Foundation Models: From Jets to Neutrino Interactions

Gregor Krzmanc, Vinicius Mikuni, Benjamin Nachman et al.

Future AI-based studies in particle physics will likely start from a foundation model to accelerate training and enhance sensitivity. As a step towards a general-purpose foundation model for particle physics, we investigate whether the OmniLearned foundation model pre-trained on diverse high-$Q^2$ simulated and real $pp$ and $ep$ collisions can be effectively transferred to a few-GeV fixed-target neutrino experiment. We process MINERvA neutrino--nucleus scattering events and evaluate pre-trained models on two types of tasks: regression of available energy and binary classification of charged-current pion final states ($\mathrm{CC1π^{\pm}}$, $\mathrm{CCNπ^{\pm}}$, and $\mathrm{CC1π^{0}}$). Pre-trained OmniLearned models consistently outperform similarly sized models trained from scratch, achieving better overall performance at the same compute budget, as well as achieving better performance at the same number of training steps. These results suggest that particle-level foundation models acquire inductive biases that generalize across large differences in energy scale, detector technology, and underlying physics processes, pointing toward a paradigm of detector-agnostic inference in particle physics.

INS-DETMar 27
Generalizable Foundation Models for Calorimetry via Mixtures-of-Experts and Parameter Efficient Fine Tuning

Carlos Cardona-Giraldo, Cristiano Fanelli, James Giroux et al.

Modern particle physics experiments face an increasing demand for high-fidelity detector simulation as luminosities rise and computational requirements approach the limits of available resources. Deep generative models have emerged as promising surrogates for traditional Monte Carlo simulation, with recent advances drawing inspiration from large language models (LLM) and next-token prediction paradigms. In this work, we introduce a generalizable foundation model for calorimetry built on next-token transformer backbones, designed to support modular adaptation across materials, particle species, and detector configurations. Our approach combines Mixture-of-Experts pre-training with parameter-efficient fine-tuning strategies to enable controlled, additive model expansion without catastrophic forgetting. A pre-trained backbone is trained to generate electromagnetic showers across multiple absorber materials, while new materials are incorporated through the addition and tuning of lightweight expert modules. Extensions to new particle types are achieved via parameter-efficient fine-tuning and modular vocabularies, preserving the integrity of the base model. This design enables efficient, incremental knowledge integration as new simulation datasets become available, a critical requirement in realistic detector-development workflows. In addition, we demonstrate that next-token calorimeter models are computationally competitive with standard generative approaches under established LLM optimization procedures. These results establish next-token architectures as a viable path toward extensible, physics-aware foundation models for calorimetry and future high-energy physics experiments.

DATA-ANOct 12, 2023
Designing Observables for Measurements with Deep Learning

Owen Long, Benjamin Nachman

Many analyses in particle and nuclear physics use simulations to infer fundamental, effective, or phenomenological parameters of the underlying physics models. When the inference is performed with unfolded cross sections, the observables are designed using physics intuition and heuristics. We propose to design targeted observables with machine learning. Unfolded, differential cross sections in a neural network output contain the most information about parameters of interest and can be well-measured by construction. The networks are trained using a custom loss function that rewards outputs that are sensitive to the parameter(s) of interest while simultaneously penalizing outputs that are different between particle-level and detector-level (to minimize detector distortions). We demonstrate this idea in simulation using two physics models for inclusive measurements in deep inelastic scattering. We find that the new approach is more sensitive than classical observables at distinguishing the two models and also has a reduced unfolding uncertainty due to the reduced detector distortions.

HEP-PHNov 3, 2025
SEAL - A Symmetry EncourAging Loss for High Energy Physics

Pradyun Hebbar, Thandikire Madula, Vinicius Mikuni et al.

Physical symmetries provide a strong inductive bias for constructing functions to analyze data. In particular, this bias may improve robustness, data efficiency, and interpretability of machine learning models. However, building machine learning models that explicitly respect symmetries can be difficult due to the dedicated components required. Moreover, real-world experiments may not exactly respect fundamental symmetries at the level of finite granularities and energy thresholds. In this work, we explore an alternative approach to create symmetry-aware machine learning models. We introduce soft constraints that allow the model to decide the importance of added symmetries during the learning process instead of enforcing exact symmetries. We investigate two complementary approaches, one that penalizes the model based on specific transformations of the inputs and one inspired by group theory and infinitesimal transformations of the inputs. Using top quark jet tagging and Lorentz equivariance as examples, we observe that the addition of the soft constraints leads to more robust performance while requiring negligible changes to current state-of-the-art models.

INS-DETOct 28, 2024
CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation

Claudius Krause, Michele Faucci Giannelli, Gregor Kasieczka et al.

We present the results of the "Fast Calorimeter Simulation Challenge 2022" - the CaloChallenge. We study state-of-the-art generative models on four calorimeter shower datasets of increasing dimensionality, ranging from a few hundred voxels to a few tens of thousand voxels. The 31 individual submissions span a wide range of current popular generative architectures, including Variational AutoEncoders (VAEs), Generative Adversarial Networks (GANs), Normalizing Flows, Diffusion models, and models based on Conditional Flow Matching. We compare all submissions in terms of quality of generated calorimeter showers, as well as shower generation time and model size. To assess the quality we use a broad range of different metrics including differences in 1-dimensional histograms of observables, KPD/FPD scores, AUCs of binary classifiers, and the log-posterior of a multiclass classifier. The results of the CaloChallenge provide the most complete and comprehensive survey of cutting-edge approaches to calorimeter fast simulation to date. In addition, our work provides a uniquely detailed perspective on the important problem of how to evaluate generative models. As such, the results presented here should be applicable for other domains that use generative AI and require fast and faithful generation of samples in a large phase space.

HEP-PHApr 29, 2024
The Landscape of Unfolding with Machine Learning

Nathan Huetsch, Javier Mariño Villadamigo, Alexander Shmakov et al.

Recent innovations from machine learning allow for data unfolding, without binning and including correlations across many dimensions. We describe a set of known, upgraded, and new methods for ML-based unfolding. The performance of these approaches are evaluated on the same two datasets. We find that all techniques are capable of accurately reproducing the particle-level spectra across complex observables. Given that these approaches are conceptually diverse, they offer an exciting toolkit for a new class of measurements that can probe the Standard Model with an unprecedented level of detail and may enable sensitivity to new phenomena.

HEP-PHNov 4, 2024
Generative Unfolding with Distribution Mapping

Anja Butter, Sascha Diefenbacher, Nathan Huetsch et al.

Machine learning enables unbinned, highly-differential cross section measurements. A recent idea uses generative models to morph a starting simulation into the unfolded data. We show how to extend two morphing techniques, Schrödinger Bridges and Direct Diffusion, in order to ensure that the models learn the correct conditional probabilities. This brings distribution mapping to a similar level of accuracy as the state-of-the-art conditional generative unfolding methods. Numerical results are presented with a standard benchmark dataset of single jet substructure as well as for a new dataset describing a 22-dimensional phase space of Z + 2-jets.

HEP-PHMay 30, 2025
Generator Based Inference (GBI)

Chi Lung Cheng, Ranit Das, Runze Li et al.

Statistical inference in physics is often based on samples from a generator (sometimes referred to as a ``forward model") that emulate experimental data and depend on parameters of the underlying theory. Modern machine learning has supercharged this workflow to enable high-dimensional and unbinned analyses to utilize much more information than ever before. We propose a general framework for describing the integration of machine learning with generators called Generator Based Inference (GBI). A well-studied special case of this setup is Simulation Based Inference (SBI) where the generator is a physics-based simulator. In this work, we examine other methods within the GBI toolkit that use data-driven methods to build the generator. In particular, we focus on resonant anomaly detection, where the generator describing the background is learned from sidebands. We show how to perform machine learning-based parameter estimation in this context with data-derived generators. This transforms the statistical outputs of anomaly detection to be directly interpretable and the performance on the LHCO community benchmark dataset establishes a new state-of-the-art for anomaly detection sensitivity.

HEP-PHMar 11, 2025
Discriminative versus Generative Approaches to Simulation-based Inference

Benjamin Sluijter, Sascha Diefenbacher, Wahid Bhimji et al.

Most of the fundamental, emergent, and phenomenological parameters of particle and nuclear physics are determined through parametric template fits. Simulations are used to populate histograms which are then matched to data. This approach is inherently lossy, since histograms are binned and low-dimensional. Deep learning has enabled unbinned and high-dimensional parameter estimation through neural likelihiood(-ratio) estimation. We compare two approaches for neural simulation-based inference (NSBI): one based on discriminative learning (classification) and one based on generative modeling. These two approaches are directly evaluated on the same datasets, with a similar level of hyperparameter optimization in both cases. In addition to a Gaussian dataset, we study NSBI using a Higgs boson dataset from the FAIR Universe Challenge. We find that both the direct likelihood and likelihood ratio estimation are able to effectively extract parameters with reasonable uncertainties. For the numerical examples and within the set of hyperparameters studied, we found that the likelihood ratio method is more accurate and/or precise. Both methods have a significant spread from the network training and would require ensembling or other mitigation strategies in practice.

GAMay 5, 2023
Weakly-Supervised Anomaly Detection in the Milky Way

Mariel Pettee, Sowmya Thanvantri, Benjamin Nachman et al.

Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satellite. CWoLa operates without the use of labeled streams or knowledge of astrophysical principles. Instead, we train a classifier to distinguish between mixed samples for which the proportions of signal and background samples are unknown. This computationally lightweight strategy is able to detect both simulated streams and the known stream GD-1 in data. Originally designed for high-energy collider physics, this technique may have broad applicability within astrophysics as well as other domains interested in identifying localized anomalies.

HEP-PHDec 7, 2021
Machine Learning in the Search for New Fundamental Physics

Georgia Karagiorgi, Gregor Kasieczka, Scott Kravitz et al.

Machine learning plays a crucial role in enhancing and accelerating the search for new fundamental physics. We review the state of machine learning methods and applications for new physics searches in the context of terrestrial high energy physics experiments, including the Large Hadron Collider, rare event searches, and neutrino experiments. While machine learning has a long history in these fields, the deep learning revolution (early 2010s) has yielded a qualitative shift in terms of the scope and ambition of research. These modern machine learning developments are the focus of the present review.

LGNov 27, 2021
Learning from learning machines: a new generation of AI technology to meet the needs of science

Luca Pion-Tonachini, Kristofer Bouchard, Hector Garcia Martin et al.

We outline emerging opportunities and challenges to enhance the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create tension between identifying patterns in data versus discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and data-driven AI learning machines, then we expect that these AI models can transform hypothesis generation, scientific discovery, and the scientific process itself.

LGNov 11, 2021
Online-compatible Unsupervised Non-resonant Anomaly Detection

Vinicius Mikuni, Benjamin Nachman, David Shih

There is a growing need for anomaly detection methods that can broaden the search for new particles in a model-agnostic manner. Most proposals for new methods focus exclusively on signal sensitivity. However, it is not enough to select anomalous events - there must also be a strategy to provide context to the selected events. We propose the first complete strategy for unsupervised detection of non-resonant anomalies that includes both signal sensitivity and a data-driven method for background estimation. Our technique is built out of two simultaneously-trained autoencoders that are forced to be decorrelated from each other. This method can be deployed offline for non-resonant anomaly detection and is also the first complete online-compatible anomaly detection strategy. We show that our method achieves excellent performance on a variety of signals prepared for the ADC2021 data challenge.

HEP-PHSep 16, 2021
A Cautionary Tale of Decorrelating Theory Uncertainties

Aishik Ghosh, Benjamin Nachman

A variety of techniques have been proposed to train machine learning classifiers that are independent of a given feature. While this can be an essential technique for enabling background estimation, it may also be useful for reducing uncertainties. We carefully examine theory uncertainties, which typically do not have a statistical origin. We will provide explicit examples of two-point (fragmentation modeling) and continuous (higher-order corrections) uncertainties where decorrelating significantly reduces the apparent uncertainty while the actual uncertainty is much larger. These results suggest that caution should be taken when using decorrelation for these types of uncertainties as long as we do not have a complete decomposition into statistically meaningful components.

MLJul 6, 2021
New Methods and Datasets for Group Anomaly Detection From Fundamental Physics

Gregor Kasieczka, Benjamin Nachman, David Shih

The identification of anomalous overdensities in data - group or collective anomaly detection - is a rich problem with a large number of real world applications. However, it has received relatively little attention in the broader ML community, as compared to point anomalies or other types of single instance outliers. One reason for this is the lack of powerful benchmark datasets. In this paper, we first explain how, after the Nobel-prize winning discovery of the Higgs boson, unsupervised group anomaly detection has become a new frontier of fundamental physics (where the motivation is to find new particles and forces). Then we propose a realistic synthetic benchmark dataset (LHCO2020) for the development of group anomaly detection algorithms. Finally, we compare several existing statistically-sound techniques for unsupervised group anomaly detection, and demonstrate their performance on the LHCO2020 dataset.

MLJun 1, 2021
Latent Space Refinement for Deep Generative Models

Ramon Winterhalder, Marco Bellagente, Benjamin Nachman

Deep generative models are becoming widely used across science and industry for a variety of purposes. A common challenge is achieving a precise implicit or explicit representation of the data probability density. Recent proposals have suggested using classifier weights to refine the learned density of deep generative models. We extend this idea to all types of generative models and show how latent space refinement via iterated generative modeling can circumvent topological obstructions and improve precision. This methodology also applies to cases were the target model is non-differentiable and has many internal latent dimensions which must be marginalized over before refinement. We demonstrate our Latent Space Refinement (LaSeR) protocol on a variety of examples, focusing on the combinations of Normalizing Flows and Generative Adversarial Networks.

MLMay 10, 2021
Scaffolding Simulations with Deep Learning for High-dimensional Deconvolution

Anders Andreassen, Patrick T. Komiske, Eric M. Metodiev et al.

A common setting for scientific inference is the ability to sample from a high-fidelity forward model (simulation) without having an explicit probability density of the data. We propose a simulation-based maximum likelihood deconvolution approach in this setting called OmniFold. Deep learning enables this approach to be naturally unbinned and (variable-, and) high-dimensional. In contrast to model parameter estimation, the goal of deconvolution is to remove detector distortions in order to enable a variety of down-stream inference tasks. Our approach is the deep learning generalization of the common Richardson-Lucy approach that is also called Iterative Bayesian Unfolding in particle physics. We show how OmniFold can not only remove detector distortions, but it can also account for noise processes and acceptance effects.

HEP-PHApr 5, 2021
Comparing Weak- and Unsupervised Methods for Resonant Anomaly Detection

Jack H. Collins, Pablo Martín-Ramiro, Benjamin Nachman et al.

Anomaly detection techniques are growing in importance at the Large Hadron Collider (LHC), motivated by the increasing need to search for new physics in a model-agnostic way. In this work, we provide a detailed comparative study between a well-studied unsupervised method called the autoencoder (AE) and a weakly-supervised approach based on the Classification Without Labels (CWoLa) technique. We examine the ability of the two methods to identify a new physics signal at different cross sections in a fully hadronic resonance search. By construction, the AE classification performance is independent of the amount of injected signal. In contrast, the CWoLa performance improves with increasing signal abundance. When integrating these approaches with a complete background estimate, we find that the two methods have complementary sensitivity. In particular, CWoLa is effective at finding diverse and moderately rare signals while the AE can provide sensitivity to very rare signals, but only with certain topologies. We therefore demonstrate that both techniques are complementary and can be used together for anomaly detection at the LHC.

HEP-PHFeb 2, 2021
A Living Review of Machine Learning for Particle Physics

Matthew Feickert, Benjamin Nachman

Modern machine learning techniques, including deep learning, are rapidly being applied, adapted, and developed for high energy physics. Given the fast pace of this research, we have created a living review with the goal of providing a nearly comprehensive list of citations for those developing and applying these approaches to experimental, phenomenological, or theoretical analyses. As a living document, it will be updated as often as possible to incorporate the latest developments. A list of proper (unchanging) reviews can be found within. Papers are grouped into a small set of topics to be as useful as possible. Suggestions and contributions are most welcome, and we provide instructions for participating.

DATA-ANJan 18, 2021
E Pluribus Unum Ex Machina: Learning from Many Collider Events at Once

Benjamin Nachman, Jesse Thaler

There have been a number of recent proposals to enhance the performance of machine learning strategies for collider physics by combining many distinct events into a single ensemble feature. To evaluate the efficacy of these proposals, we study the connection between single-event classifiers and multi-event classifiers under the assumption that collider events are independent and identically distributed (IID). We show how one can build optimal multi-event classifiers from single-event classifiers, and we also show how to construct multi-event classifiers such that they produce optimal single-event classifiers. This is illustrated for a Gaussian example as well as for classification tasks relevant for searches and measurements at the Large Hadron Collider. We extend our discussion to regression tasks by showing how they can be phrased in terms of parametrized classifiers. Empirically, we find that training a single-event (per-instance) classifier is more effective than training a multi-event (per-ensemble) classifier, as least for the cases we studied, and we relate this fact to properties of the loss function gradient in the two cases. While we did not identify a clear benefit from using multi-event classifiers in the collider context, we speculate on the potential value of these methods in cases involving only approximate independence, as relevant for jet substructure studies.

INS-DETDec 8, 2020
Beyond 4D Tracking: Using Cluster Shapes for Track Seeding

Patrick J. Fox, Shangqing Huang, Joshua Isaacson et al.

Tracking is one of the most time consuming aspects of event reconstruction at the Large Hadron Collider (LHC) and its high-luminosity upgrade (HL-LHC). Innovative detector technologies extend tracking to four-dimensions by including timing in the pattern recognition and parameter estimation. However, present and future hardware already have additional information that is largely unused by existing track seeding algorithms. The shape of clusters provides an additional dimension for track seeding that can significantly reduce the combinatorial challenge of track finding. We use neural networks to show that cluster shapes can reduce significantly the rate of fake combinatorical backgrounds while preserving a high efficiency. We demonstrate this using the information in cluster singlets, doublets and triplets. Numerical results are presented with simulations from the TrackML challenge.

HEP-PHOct 7, 2020
Parameter Estimation using Neural Networks in the Presence of Detector Effects

Anders Andreassen, Shih-Chieh Hsu, Benjamin Nachman et al.

Histogram-based template fits are the main technique used for estimating parameters of high energy physics Monte Carlo generators. Parametrized neural network reweighting can be used to extend this fitting procedure to many dimensions and does not require binning. If the fit is to be performed using reconstructed data, then expensive detector simulations must be used for training the neural networks. We introduce a new two-level fitting approach that only requires one dataset with detector simulation and then a set of additional generation-level datasets without detector effects included. This Simulation-level fit based on Reweighting Generator-level events with Neural networks (SRGN) is demonstrated using simulated datasets for a variety of examples including a simple Gaussian random variable, parton shower tuning, and the top quark mass extraction.

HEP-PHSep 4, 2020
Simulation-Assisted Decorrelation for Resonant Anomaly Detection

Kees Benkendorfer, Luc Le Pottier, Benjamin Nachman

A growing number of weak- and unsupervised machine learning approaches to anomaly detection are being proposed to significantly extend the search program at the Large Hadron Collider and elsewhere. One of the prototypical examples for these methods is the search for resonant new physics, where a bump hunt can be performed in an invariant mass spectrum. A significant challenge to methods that rely entirely on data is that they are susceptible to sculpting artificial bumps from the dependence of the machine learning classifier on the invariant mass. We explore two solutions to this challenge by minimally incorporating simulation into the learning. In particular, we study the robustness of Simulation Assisted Likelihood-free Anomaly Detection (SALAD) to correlations between the classifier and the invariant mass. Next, we propose a new approach that only uses the simulation for decorrelation but the Classification without Labels (CWoLa) approach for achieving signal sensitivity. Both methods are compared using a full background fit analysis on simulated data from the LHC Olympics and are robust to correlations in the data.

HEP-PHSep 3, 2020
DCTRGAN: Improving the Precision of Generative Models with Reweighting

Sascha Diefenbacher, Engin Eren, Gregor Kasieczka et al.

Significant advances in deep learning have led to more widely used and precise neural network-based generative models such as Generative Adversarial Networks (GANs). We introduce a post-hoc correction to deep generative models to further improve their fidelity, based on the Deep neural networks using the Classification for Tuning and Reweighting (DCTR) protocol. The correction takes the form of a reweighting function that can be applied to generated examples when making predictions from the simulation. We illustrate this approach using GANs trained on standard multimodal probability densities as well as calorimeter simulations from high energy physics. We show that the weighted GAN examples significantly improve the accuracy of the generated samples without a large loss in statistical power. This approach could be applied to any generative model and is a promising refinement method for high energy physics applications and beyond.

HEP-PHAug 14, 2020
GANplifying Event Samples

Anja Butter, Sascha Diefenbacher, Gregor Kasieczka et al.

A critical question concerning generative networks applied to event generation in particle physics is if the generated events add statistical precision beyond the training sample. We show for a simple example with increasing dimensionality how generative networks indeed amplify the training statistics. We quantify their impact through an amplification factor or equivalent numbers of sampled events.

HEP-PHJan 14, 2020
Simulation Assisted Likelihood-free Anomaly Detection

Anders Andreassen, Benjamin Nachman, David Shih

Given the lack of evidence for new particle discoveries at the Large Hadron Collider (LHC), it is critical to broaden the search program. A variety of model-independent searches have been proposed, adding sensitivity to unexpected signals. There are generally two types of such searches: those that rely heavily on simulations and those that are entirely based on (unlabeled) data. This paper introduces a hybrid method that makes the best of both approaches. For potential signals that are resonant in one known feature, this new method first learns a parameterized reweighting function to morph a given simulation to match the data in sidebands. This function is then interpolated into the signal region and then the reweighted background-only simulation can be used for supervised learning as well as for background estimation. The background estimation from the reweighted simulation allows for non-trivial correlations between features used for classification and the resonant feature. A dijet search with jet substructure is used to illustrate the new method. Future applications of Simulation Assisted Likelihood-free Anomaly Detection (SALAD) include a variety of final states and potential combinations with other model-independent approaches.

HEP-PHJan 14, 2020
Anomaly Detection with Density Estimation

Benjamin Nachman, David Shih

We leverage recent breakthroughs in neural density estimation to propose a new unsupervised anomaly detection technique (ANODE). By estimating the probability density of the data in a signal region and in sidebands, and interpolating the latter into the signal region, a likelihood ratio of data vs. background can be constructed. This likelihood ratio is broadly sensitive to overdensities in the data that could be due to localized anomalies. In addition, a unique potential benefit of the ANODE method is that the background can be directly estimated using the learned densities. Finally, ANODE is robust against systematic differences between signal region and sidebands, giving it broader applicability than other methods. We demonstrate the power of this new approach using the LHC Olympics 2020 R\&D Dataset. We show how ANODE can enhance the significance of a dijet bump hunt by up to a factor of 7 with a 10\% accuracy on the background prediction. While the LHC is used as the recurring example, the methods developed here have a much broader applicability to anomaly detection in physics and beyond.

HEP-PHNov 20, 2019
OmniFold: A Method to Simultaneously Unfold All Observables

Anders Andreassen, Patrick T. Komiske, Eric M. Metodiev et al.

Collider data must be corrected for detector effects ("unfolded") to be compared with many theoretical calculations and measurements from other experiments. Unfolding is traditionally done for individual, binned observables without including all information relevant for characterizing the detector response. We introduce OmniFold, an unfolding method that iteratively reweights a simulated dataset, using machine learning to capitalize on all available information. Our approach is unbinned, works for arbitrarily high-dimensional data, and naturally incorporates information from the full phase space. We illustrate this technique on a realistic jet substructure example from the Large Hadron Collider and compare it to standard binned unfolding methods. This new paradigm enables the simultaneous measurement of all observables, including those not yet invented at the time of the analysis.

HEP-PHOct 18, 2019
AI Safety for High Energy Physics

Benjamin Nachman, Chase Shimmin

The field of high-energy physics (HEP), along with many scientific disciplines, is currently experiencing a dramatic influx of new methodologies powered by modern machine learning techniques. Over the last few years, a growing body of HEP literature has focused on identifying promising applications of deep learning in particular, and more recently these techniques are starting to be realized in an increasing number of experimental measurements. The overall conclusion from this impressive and extensive set of studies is that rarer and more complex physics signatures can be identified with the new set of powerful tools from deep learning. However, there is an unstudied systematic risk associated with combining the traditional HEP workflow and deep learning with high-dimensional data. In particular, calibrating and validating the response of deep neural networks is in general not experimentally feasible, and therefore current methods may be biased in ways that are not covered by current uncertainty estimates. By borrowing ideas from AI safety, we illustrate these potential issues and propose a method to bound the size of unaccounted for uncertainty. In addition to providing a pragmatic diagnostic, this work will hopefully begin a dialogue within the community about the robust application of deep learning to experimental analyses.

HEP-PHJul 18, 2019
Neural Networks for Full Phase-space Reweighting and Parameter Tuning

Anders Andreassen, Benjamin Nachman

Precise scientific analysis in collider-based particle physics is possible because of complex simulations that connect fundamental theories to observable quantities. The significant computational cost of these programs limits the scope, precision, and accuracy of Standard Model measurements and searches for new phenomena. We therefore introduce Deep neural networks using Classification for Tuning and Reweighting (DCTR), a neural network-based approach to reweight and fit simulations using all kinematic and flavor information -- the full phase space. DCTR can perform tasks that are currently not possible with existing methods, such as estimating non-perturbative fragmentation uncertainties. The core idea behind the new approach is to exploit powerful high-dimensional classifiers to reweight phase space as well as to identify the best parameters for describing data. Numerical examples from $e^+e^-\rightarrow\text{jets}$ demonstrate the fidelity of these methods for simulation parameters that have a big and broad impact on phase space as well as those that have a minimal and/or localized impact. The high fidelity of the full phase-space reweighting enables a new paradigm for simulations, parameter tuning, and model systematic uncertainties across particle physics and possibly beyond.

HEP-PHJan 30, 2018
Learning to Classify from Impure Samples with High-Dimensional Data

Patrick T. Komiske, Eric M. Metodiev, Benjamin Nachman et al.

A persistent challenge in practical classification tasks is that labeled training sets are not always available. In particle physics, this challenge is surmounted by the use of simulations. These simulations accurately reproduce most features of data, but cannot be trusted to capture all of the complex correlations exploitable by modern machine learning methods. Recent work in weakly supervised learning has shown that simple, low-dimensional classifiers can be trained using only the impure mixtures present in data. Here, we demonstrate that complex, high-dimensional classifiers can also be trained on impure mixtures using weak supervision techniques, with performance comparable to what could be achieved with pure samples. Using weak supervision will therefore allow us to avoid relying exclusively on simulations for high-dimensional classification. This work opens the door to a new regime whereby complex models are trained directly on data, providing direct access to probe the underlying physics.

HEP-EXDec 21, 2017
CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative Adversarial Networks

Michela Paganini, Luke de Oliveira, Benjamin Nachman

The precise modeling of subatomic particle interactions and propagation through matter is paramount for the advancement of nuclear and particle physics searches and precision measurements. The most computationally expensive step in the simulation pipeline of a typical experiment at the Large Hadron Collider (LHC) is the detailed modeling of the full complexity of physics processes that govern the motion and evolution of particle showers inside calorimeters. We introduce \textsc{CaloGAN}, a new fast simulation technique based on generative adversarial networks (GANs). We apply these neural networks to the modeling of electromagnetic showers in a longitudinally segmented calorimeter, and achieve speedup factors comparable to or better than existing full simulation techniques on CPU ($100\times$-$1000\times$) and even faster on GPU (up to $\sim10^5\times$). There are still challenges for achieving precision across the entire phase space, but our solution can reproduce a variety of geometric shower shape properties of photons, positrons and charged pions. This represents a significant stepping stone toward a full neural network-based detector simulation that could save significant computing time and enable many analyses now and in the future.

HEP-EXNov 23, 2017
Controlling Physical Attributes in GAN-Accelerated Simulation of Electromagnetic Calorimeters

Luke de Oliveira, Michela Paganini, Benjamin Nachman

High-precision modeling of subatomic particle interactions is critical for many fields within the physical sciences, such as nuclear physics and high energy particle physics. Most simulation pipelines in the sciences are computationally intensive -- in a variety of scientific fields, Generative Adversarial Networks have been suggested as a solution to speed up the forward component of simulation, with promising results. An important component of any simulation system for the sciences is the ability to condition on any number of physically meaningful latent characteristics that can effect the forward generation procedure. We introduce an auxiliary task to the training of a Generative Adversarial Network on particle showers in a multi-layer electromagnetic calorimeter, which allows our model to learn an attribute-aware conditioning mechanism.

HEP-PHAug 9, 2017
Classification without labels: Learning from mixed samples in high energy physics

Eric M. Metodiev, Benjamin Nachman, Jesse Thaler

Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Crucially, neither individual labels nor class proportions are required, yet we prove that the optimal classifier in the CWoLa paradigm is also the optimal classifier in the traditional fully-supervised case where all label information is available. After demonstrating the power of this method in an analytical toy example, we consider a realistic benchmark for collider physics: distinguishing quark- versus gluon-initiated jets using mixed quark/gluon training samples. More generally, CWoLa can be applied to any classification problem where labels or class proportions are unknown or simulations are unreliable, but statistical mixtures of the classes are available.

HEP-PHJul 26, 2017
Pileup Mitigation with Machine Learning (PUMML)

Patrick T. Komiske, Eric M. Metodiev, Benjamin Nachman et al.

Pileup involves the contamination of the energy distribution arising from the primary collision of interest (leading vertex) by radiation from soft collisions (pileup). We develop a new technique for removing this contamination using machine learning and convolutional neural networks. The network takes as input the energy distribution of charged leading vertex particles, charged pileup particles, and all neutral particles and outputs the energy distribution of particles coming from leading vertex alone. The PUMML algorithm performs remarkably well at eliminating pileup distortion on a wide range of simple and complex jet observables. We test the robustness of the algorithm in a number of ways and discuss how the network can be trained directly on data.

HEP-EXMay 5, 2017
Accelerating Science with Generative Adversarial Networks: An Application to 3D Particle Showers in Multi-Layer Calorimeters

Michela Paganini, Luke de Oliveira, Benjamin Nachman

Physicists at the Large Hadron Collider (LHC) rely on detailed simulations of particle collisions to build expectations of what experimental data may look like under different theory modeling assumptions. Petabytes of simulated data are needed to develop analysis techniques, though they are expensive to generate using existing algorithms and computing resources. The modeling of detectors and the precise description of particle cascades as they interact with the material in the calorimeter are the most computationally demanding steps in the simulation pipeline. We therefore introduce a deep neural network-based generative model to enable high-fidelity, fast, electromagnetic calorimeter simulation. There are still challenges for achieving precision across the entire phase space, but our current solution can reproduce a variety of particle shower properties while achieving speed-up factors of up to 100,000$\times$. This opens the door to a new era of fast simulation that could save significant computing time and disk space, while extending the reach of physics searches and precision measurements at the LHC and beyond.

HEP-PHFeb 1, 2017
Weakly Supervised Classification in High Energy Physics

Lucio Mwinmaarong Dery, Benjamin Nachman, Francesco Rubbo et al.

As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics - quark versus gluon tagging - we show that weakly supervised classification can match the performance of fully supervised algorithms. Furthermore, by design, the new algorithm is insensitive to any mis-modeling of discriminating features in the data by the simulation. Weakly supervised classification is a general procedure that can be applied to a wide variety of learning problems to boost performance and robustness when detailed simulations are not reliable or not available.

MLJan 20, 2017
Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis

Luke de Oliveira, Michela Paganini, Benjamin Nachman

We provide a bridge between generative modeling in the Machine Learning community and simulated physical processes in High Energy Particle Physics by applying a novel Generative Adversarial Network (GAN) architecture to the production of jet images -- 2D representations of energy depositions from particles interacting with a calorimeter. We propose a simple architecture, the Location-Aware Generative Adversarial Network, that learns to produce realistic radiation patterns from simulated high energy particle collisions. The pixel intensities of GAN-generated images faithfully span over many orders of magnitude and exhibit the desired low-dimensional physical properties (i.e., jet mass, n-subjettiness, etc.). We shed light on limitations, and provide a novel empirical validation of image quality and validity of GAN-produced simulations of the natural world. This work provides a base for further explorations of GANs for use in faster simulation in High Energy Particle Physics.