HEP-PHMay 25
A universal vision transformer for fast calorimeter simulationsLuigi Favaro, Andrea Giammanco, Claudius Krause
The high-dimensional complex nature of detectors makes fast calorimeter simulations a prime application for modern generative machine learning. Vision transformers (ViTs) can emulate the Geant4 response with unmatched accuracy and are not limited to regular geometries. Starting from the CaloDREAM architecture, we demonstrate the robustness and scalability of ViTs on regular and irregular geometries, and multiple detectors. Our results show that ViTs generate electromagnetic and hadronic showers with minimal deviations from Geant4 in multiple evaluation metrics, while maintaining the generation time in the $\mathcal{O}(10-100)$ ms on a single GPU. Furthermore, we show that pretraining on a large dataset and fine-tuning on the target geometry leads to reduced training costs and higher data efficiency, or altogether improves the fidelity of generated showers.
HEP-PHMar 15, 2022
New directions for surrogate models and differentiable programming for High Energy Physics detector simulationAndreas Adelmann, Walter Hopkins, Evangelos Kourlitis et al.
The computational cost for high energy physics detector simulation in future experimental facilities is going to exceed the current available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, providing controllable and scalable simulation routines. In this document, new and ongoing efforts for surrogate models and differential programming applied to detector simulation are discussed in the context of the 2021 Particle Physics Community Planning Exercise (`Snowmass').
INS-DETOct 25, 2022
CaloFlow for CaloChallenge Dataset 1Claudius Krause, Ian Pang, David Shih
CaloFlow is a new and promising approach to fast calorimeter simulation based on normalizing flows. Applying CaloFlow to the photon and charged pion Geant4 showers of Dataset 1 of the Fast Calorimeter Simulation Challenge 2022, we show how it can produce high-fidelity samples with a sampling time that is several orders of magnitude faster than Geant4. We demonstrate the fidelity of the samples using calorimeter shower images, histograms of high-level features, and aggregate metrics such as a classifier trained to distinguish CaloFlow from Geant4 samples.
INS-DETOct 28, 2024
CaloChallenge 2022: A Community Challenge for Fast Calorimeter SimulationClaudius Krause, Michele Faucci Giannelli, Gregor Kasieczka et al.
We present the results of the "Fast Calorimeter Simulation Challenge 2022" - the CaloChallenge. We study state-of-the-art generative models on four calorimeter shower datasets of increasing dimensionality, ranging from a few hundred voxels to a few tens of thousand voxels. The 31 individual submissions span a wide range of current popular generative architectures, including Variational AutoEncoders (VAEs), Generative Adversarial Networks (GANs), Normalizing Flows, Diffusion models, and models based on Conditional Flow Matching. We compare all submissions in terms of quality of generated calorimeter showers, as well as shower generation time and model size. To assess the quality we use a broad range of different metrics including differences in 1-dimensional histograms of observables, KPD/FPD scores, AUCs of binary classifiers, and the log-posterior of a multiclass classifier. The results of the CaloChallenge provide the most complete and comprehensive survey of cutting-edge approaches to calorimeter fast simulation to date. In addition, our work provides a uniquely detailed perspective on the important problem of how to evaluate generative models. As such, the results presented here should be applicable for other domains that use generative AI and require fast and faithful generation of samples in a large phase space.
INS-DETDec 15, 2023
Deep Generative Models for Detector Signature Simulation: A Taxonomic ReviewBaran Hashemi, Claudius Krause
In modern collider experiments, the quest to explore fundamental interactions between elementary particles has reached unparalleled levels of precision. Signatures from particle physics detectors are low-level objects (such as energy depositions or tracks) encoding the physics of collisions (the final state particles of hard scattering interactions). The complete simulation of them in a detector is a computational and storage-intensive task. To address this computational bottleneck in particle physics, alternative approaches have been developed, introducing additional assumptions and trade off accuracy for speed.The field has seen a surge in interest in surrogate modeling the detector simulation, fueled by the advancements in deep generative models. These models aim to generate responses that are statistically identical to the observed data. In this paper, we conduct a comprehensive and exhaustive taxonomic review of the existing literature on the simulation of detector signatures from both methodological and application-wise perspectives. Initially, we formulate the problem of detector signature simulation and discuss its different variations that can be unified. Next, we classify the state-of-the-art methods into five distinct categories based on their underlying model architectures, summarizing their respective generation strategies. Finally, we shed light on the challenges and opportunities that lie ahead in detector signature simulation, setting the stage for future research and development.
HEP-PHApr 4, 2025
BitHEP -- The Limits of Low-Precision ML in HEPClaudius Krause, Daohan Wang, Ramon Winterhalder
The increasing complexity of modern neural network architectures demands fast and memory-efficient implementations to mitigate computational bottlenecks. In this work, we evaluate the recently proposed BitNet architecture in HEP applications, assessing its performance in classification, regression, and generative modeling tasks. Specifically, we investigate its suitability for quark-gluon discrimination, SMEFT parameter estimation, and detector simulation, comparing its efficiency and accuracy to state-of-the-art methods. Our results show that while BitNet consistently performs competitively in classification tasks, its performance in regression and generation varies with the size and type of the network, highlighting key limitations and potential areas for improvement.
INS-DETMay 19, 2023
Inductive Simulation of Calorimeter Showers with Normalizing FlowsMatthew R. Buckley, Claudius Krause, Ian Pang et al.
Simulating particle detector response is the single most expensive step in the Large Hadron Collider computational pipeline. Recently it was shown that normalizing flows can accelerate this process while achieving unprecedented levels of accuracy, but scaling this approach up to higher resolutions relevant for future detector upgrades leads to prohibitive memory constraints. To overcome this problem, we introduce Inductive CaloFlow (iCaloFlow), a framework for fast detector simulation based on an inductive series of normalizing flows trained on the pattern of energy depositions in pairs of consecutive calorimeter layers. We further use a teacher-student distillation to increase sampling speed without loss of expressivity. As we demonstrate with Datasets 2 and 3 of the CaloChallenge2022, iCaloFlow can realize the potential of normalizing flows in performing fast, high-fidelity simulation on detector geometries that are ~ 10 - 100 times higher granularity than previously considered.
INS-DETOct 21, 2021
CaloFlow II: Even Faster and Still Accurate Generation of Calorimeter Showers with Normalizing FlowsClaudius Krause, David Shih
Recently, we introduced CaloFlow, a high-fidelity generative model for GEANT4 calorimeter shower emulation based on normalizing flows. Here, we present CaloFlow v2, an improvement on our original framework that speeds up shower generation by a further factor of 500 relative to the original. The improvement is based on a technique called Probability Density Distillation, originally developed for speech synthesis in the ML literature, and which we develop further by introducing a set of powerful new loss terms. We demonstrate that CaloFlow v2 preserves the same high fidelity of the original using qualitative (average images, histograms of high level features) and quantitative (classifier metric between GEANT4 and generated samples) measures. The result is a generative model for calorimeter showers that matches the state-of-the-art in speed (a factor of $10^4$ faster than GEANT4) and greatly surpasses the previous state-of-the-art in fidelity.
INS-DETJun 9, 2021
CaloFlow: Fast and Accurate Generation of Calorimeter Showers with Normalizing FlowsClaudius Krause, David Shih
We introduce CaloFlow, a fast detector simulation framework based on normalizing flows. For the first time, we demonstrate that normalizing flows can reproduce many-channel calorimeter showers with extremely high fidelity, providing a fresh alternative to computationally expensive GEANT4 simulations, as well as other state-of-the-art fast simulation frameworks based on GANs and VAEs. Besides the usual histograms of physical features and images of calorimeter showers, we introduce a new metric for judging the quality of generative modeling: the performance of a classifier trained to differentiate real from generated images. We show that GAN-generated images can be identified by the classifier with nearly 100% accuracy, while images generated from CaloFlow are better able to fool the classifier. More broadly, normalizing flows offer several advantages compared to other state-of-the-art approaches (GANs and VAEs), including: tractable likelihoods; stable and convergent training; and principled model selection. Normalizing flows also provide a bijective mapping between data and the latent space, which could have other applications beyond simulation, for example, to detector unfolding.
COMP-PHJan 15, 2020
i-flow: High-dimensional Integration and Sampling with Normalizing FlowsChristina Gao, Joshua Isaacson, Claudius Krause
In many fields of science, high-dimensional integration is required. Numerical methods have been developed to evaluate these complex integrals. We introduce the code i-flow, a python package that performs high-dimensional numerical integration utilizing normalizing flows. Normalizing flows are machine-learned, bijective mappings between two distributions. i-flow can also be used to sample random points according to complicated distributions in high dimensions. We compare i-flow to other algorithms for high-dimensional numerical integration and show that i-flow outperforms them for high dimensional correlated integrals. The i-flow code is publicly available on gitlab at https://gitlab.com/i-flow/i-flow.