HEP-EX Aug 17, 2023
SR-GAN for SR-gamma: super resolution of photon calorimeter images at collider experiments
Johannes Erdmann, Aaron van der Graaf, Florian Mausolf et al.
We study single-image super-resolution algorithms for photons at collider experiments based on generative adversarial networks. We treat the energy depositions of simulated electromagnetic showers of photons and neutral-pion decays in a toy electromagnetic calorimeter as 2D images and we train super-resolution networks to generate images with an artificially increased resolution by a factor of four in each dimension. The generated images are able to reproduce features of the electromagnetic showers that are not obvious from the images at nominal resolution. Using the artificially-enhanced images for the reconstruction of shower-shape variables and of the position of the shower center results in significant improvements. We additionally investigate the utilization of the generated images as a pre-processing step for deep-learning photon-identification algorithms and observe improvements in the case of training samples of small size.
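The abstract does not say how the position of the shower center is reconstructed. As a minimal sketch, assuming a plain energy-weighted centroid over the calorimeter cells (a common simple estimator, not necessarily the one used in the paper), the calculation on a 2D image could look like this. The function name `shower_center` and the toy image are hypothetical.

```python
def shower_center(image):
    """Energy-weighted centroid (row, col) of a 2D grid of cell energies.

    Assumption: `image` is a list of equally long rows of non-negative
    energies; indices are returned in units of cells.
    """
    total = sum(sum(row) for row in image)
    if total <= 0:
        raise ValueError("image contains no deposited energy")
    row_c = sum(i * e for i, row in enumerate(image) for e in row) / total
    col_c = sum(j * e for row in image for j, e in enumerate(row)) / total
    return row_c, col_c


# Hypothetical 3x3 toy shower, symmetric around the central cell.
toy_image = [
    [0.0, 1.0, 0.0],
    [1.0, 4.0, 1.0],
    [0.0, 1.0, 0.0],
]
center = shower_center(toy_image)
```

On an image with four times finer granularity in each dimension, the same estimator works on smaller cells, which is one way a higher-resolution image can sharpen a position estimate.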
HEP-PH Aug 5, 2024
KAN we improve on HEP classification tasks? Kolmogorov-Arnold Networks applied to an LHC physics example
Johannes Erdmann, Florian Mausolf, Jan Lukas Späh
Recently, Kolmogorov-Arnold Networks (KANs) have been proposed as an alternative to multilayer perceptrons, suggesting advantages in performance and interpretability. We study a typical binary event classification task in high-energy physics including high-level features and comment on the performance and interpretability of KANs in this context. Consistent with expectations, we find that the learned activation functions of a one-layer KAN resemble the univariate log-likelihood ratios of the respective input features. In deeper KANs, the activations in the first layer differ from those in the one-layer KAN, which indicates that the deeper KANs learn more complex representations of the data, a pattern commonly observed in other deep-learning architectures. We study KANs with different depths and widths and we compare them to multilayer perceptrons in terms of performance and number of trainable parameters. For the chosen classification task, we do not find that KANs are more parameter efficient. However, small KANs may offer advantages in terms of interpretability that come at the cost of only a moderate loss in performance.
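For orientation, the univariate log-likelihood ratio that the one-layer KAN activations are compared to can be estimated from histograms of a single input feature. The following is a rough sketch under stated assumptions: the binning, the regularisation constant `eps`, the function name `binned_llr`, and the toy samples are all hypothetical and not taken from the paper.

```python
import math


def binned_llr(signal, background, edges, eps=1e-9):
    """Per-bin log(p_signal / p_background) for one input feature.

    Assumptions: `edges` is sorted, both samples are non-empty and lie
    within [edges[0], edges[-1]]; `eps` avoids log(0) in empty bins.
    """
    def normalised_hist(values):
        counts = [0] * (len(edges) - 1)
        for v in values:
            for k in range(len(counts)):
                is_last = k == len(counts) - 1
                # Bins are half-open; the last bin includes its upper edge.
                if edges[k] <= v < edges[k + 1] or (is_last and v == edges[-1]):
                    counts[k] += 1
                    break
        n = sum(counts)
        return [c / n for c in counts]

    p_sig = normalised_hist(signal)
    p_bkg = normalised_hist(background)
    return [math.log((s + eps) / (b + eps)) for s, b in zip(p_sig, p_bkg)]


# Hypothetical toy feature: signal tends to larger values than background.
sig = [0.6, 0.7, 0.8, 0.9]
bkg = [0.1, 0.2, 0.3, 0.9]
llr = binned_llr(sig, bkg, edges=[0.0, 0.5, 1.0])
```

A positive entry marks a bin where the feature value is more signal-like, a negative entry one where it is more background-like.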
DATA-AN Jan 12
Learning to bin: differentiable and Bayesian optimization for multi-dimensional discriminants in high-energy physics
Johannes Erdmann, Nitish Kumar Kasaraguppe, Florian Mausolf
Categorizing events using discriminant observables is central to many high-energy physics analyses. Yet, bin boundaries are often chosen by hand. A simple, popular choice is to apply argmax projections of multi-class scores and equidistant binning of one-dimensional discriminants. We propose a binning optimization for signal significance directly in multi-dimensional discriminants. We use a Gaussian Mixture Model (GMM) to define flexible bin boundary shapes for multi-class scores, while in one dimension (binary classification) we move bin boundaries directly. On this binning model, we study two optimization strategies: a differentiable and a Bayesian optimization approach. We study two toy setups: a binary classification and a three-class problem with two signals and backgrounds. In the one-dimensional case, both approaches achieve similar gains in signal sensitivity compared to equidistant binnings for a given number of bins. In the multi-dimensional case, the GMM-based binning defines sensitive categories as well, with the differentiable approach performing best. We show that, in particular for limited separability of the signal processes, our approach outperforms argmax classification even with optimized binning in the one-dimensional projections. Both methods are released as lightweight Python plugins intended for straightforward integration into existing analyses.
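To illustrate why bin boundaries matter in the one-dimensional case, the sketch below scores a given binning of a binary-classifier output by a combined significance. It assumes the Asimov approximation per bin, added in quadrature, as the figure of merit; the abstract only says "signal significance", so this choice, the function names, and the toy scores are assumptions. No optimizer is included; the example only compares two hand-picked binnings.

```python
import math


def asimov_significance(s, b):
    """Asimov approximation of the discovery significance for one bin."""
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


def binned_significance(sig_scores, bkg_scores, edges):
    """Combine per-bin significances in quadrature for the given bin edges.

    Assumptions: scores lie in [0, 1), events are unweighted, and bins
    without signal or without background are skipped (a simplification).
    """
    total_sq = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        s = sum(1 for x in sig_scores if lo <= x < hi)
        b = sum(1 for x in bkg_scores if lo <= x < hi)
        if s > 0 and b > 0:
            total_sq += asimov_significance(s, b) ** 2
    return math.sqrt(total_sq)


# Hypothetical classifier scores for a toy binary problem.
sig_scores = [0.55, 0.65, 0.85, 0.90, 0.95, 0.97]
bkg_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.90]

z_equidistant = binned_significance(sig_scores, bkg_scores, [0.0, 0.5, 1.0])
z_moved = binned_significance(sig_scores, bkg_scores, [0.0, 0.8, 1.0])
```

In this toy, moving the inner boundary from 0.5 to 0.8 isolates a purer signal bin and raises the combined significance. An optimizer, whether gradient-based or Bayesian, would search over such boundary positions instead of fixing them by hand.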
HC Apr 13
Enabling users to work sustainably on shared institute computing resources
Niclas Eich, Johannes Erdmann, Martin Erdmann et al.
The VISPA project is a self-managed, mid-scale computing cluster that supports physics data analysis in research and teaching. Because the cluster is housed in a 1970s institute building with limited retrofit options, conventional efficiency upgrades would yield only minor energy savings. We therefore target sustainability primarily through user-centric measures. A monitoring system now records per-job energy consumption, while real-time data on the renewable share of the German power grid enable "green-window" scheduling. Users can query their individual energy consumption and carbon footprints, receive weekly reports, and tag jobs by project for aggregate accounting; memory records from previous runs help avoid oversubscription. All options are voluntary, fostering a cultural shift rather than imposing hard constraints. A simulation framework evaluates the potential impact of these measures. Together, the technological and behavioral interventions aim at medium- to long-term reductions in greenhouse-gas emissions by increasing resource awareness within the scientific community.
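One way to picture green-window scheduling is to pick, from a forecast of the renewable share, the contiguous time window in which a job of known duration would see the highest average share. The sketch below is only an illustration under that assumption; it is not the VISPA implementation, and the function name `best_green_window` and the hourly forecast values are hypothetical.

```python
def best_green_window(renewable_share, duration):
    """Return (start_index, mean_share) of the greenest contiguous window.

    Assumptions: `renewable_share` holds one forecast value per time slot
    (fractions between 0 and 1) and `duration` is the job length in slots.
    Ties are resolved in favour of the earliest window.
    """
    if duration <= 0 or duration > len(renewable_share):
        raise ValueError("duration must fit inside the forecast")
    best_start, best_sum = 0, sum(renewable_share[:duration])
    for start in range(1, len(renewable_share) - duration + 1):
        window_sum = sum(renewable_share[start:start + duration])
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / duration


# Hypothetical hourly forecast of the renewable share of the grid.
forecast = [0.30, 0.35, 0.50, 0.70, 0.65, 0.40]
start, mean_share = best_green_window(forecast, duration=2)
```

Because all measures described above are voluntary, such a result would serve as a suggestion to the user rather than a constraint enforced by the scheduler.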
HEP-PH Mar 27, 2024
One flow to correct them all: improving simulations in high-energy physics with a single normalising flow and a switch
Caio Cesar Daumann, Mauro Donega, Johannes Erdmann et al.
Simulated events are key ingredients in almost all high-energy physics analyses. However, imperfections in the simulation can lead to sizeable differences between the observed data and simulated events. The effects of such mismodelling on relevant observables must be corrected either effectively via scale factors, with weights or by modifying the distributions of the observables and their correlations. We introduce a correction method that transforms one multidimensional distribution (simulation) into another one (data) using a simple architecture based on a single normalising flow with a boolean condition. We demonstrate the effectiveness of the method on a physics-inspired toy dataset with non-trivial mismodelling of several observables and their correlations.