LG Jul 4, 2024
SineKAN: Kolmogorov-Arnold Networks Using Sinusoidal Activation Functions
Eric A. F. Reinhardt, P. R. Dinesh, Sergei Gleyzer
Recent work has established an alternative to traditional multi-layer perceptron neural networks in the form of Kolmogorov-Arnold Networks (KANs). The general KAN framework uses learnable activation functions on the edges of the computational graph followed by summation on nodes. The learnable edge activation functions in the original implementation are basis spline functions (B-Splines). Here, we present a model in which learnable grids of B-Spline activation functions are replaced by grids of re-weighted sine functions (SineKAN). We evaluate the numerical performance of our model on a benchmark vision task. We show that our model can perform better than or comparably to B-Spline KAN models and an alternative KAN implementation based on periodic cosine and sine functions representing a Fourier series. Further, we show that SineKAN has numerical accuracy that could scale comparably to dense neural networks (DNNs). Compared to the two baseline KAN models, SineKAN achieves a substantial speed increase at all hidden layer sizes, batch sizes, and depths. The current advantages of DNNs due to hardware and software optimizations are discussed along with theoretical scaling. Properties of SineKAN compared to other KAN implementations and its current limitations are also discussed.
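The idea of edge activations built from a grid of re-weighted sine functions, followed by summation on nodes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `sine_kan_layer`, the integer frequency grid, the linearly spaced phases, and the amplitude scaling are all assumptions chosen for illustration, and only the amplitudes play the role of learnable weights here.

```python
import numpy as np

def sine_kan_layer(x, amplitudes, freqs, phases, bias):
    """Hypothetical sine-based KAN layer (illustrative only).

    x:          (batch, in_dim) inputs
    amplitudes: (out_dim, in_dim, grid) weights on each sine term
    freqs:      (grid,) frequencies of the sine grid (assumed fixed)
    phases:     (in_dim, grid) phase shifts (assumed fixed)
    bias:       (out_dim,) additive bias per output node
    """
    # Evaluate every sine of the grid on every input: (batch, in_dim, grid)
    basis = np.sin(x[:, :, None] * freqs[None, None, :] + phases[None, :, :])
    # Re-weight the sines on each edge, then sum over edges and grid: (batch, out_dim)
    return np.einsum("big,oig->bo", basis, amplitudes) + bias

rng = np.random.default_rng(0)
batch, in_dim, out_dim, grid = 4, 3, 2, 5

x = rng.normal(size=(batch, in_dim))
amplitudes = rng.normal(size=(out_dim, in_dim, grid)) / (in_dim * grid)
freqs = np.arange(1, grid + 1, dtype=float)  # assumed frequency grid 1..grid
phases = np.linspace(0.0, np.pi, in_dim * grid, endpoint=False).reshape(in_dim, grid)
bias = np.zeros(out_dim)

y = sine_kan_layer(x, amplitudes, freqs, phases, bias)
print(y.shape)
```

Stacking several such layers gives a deeper network; in a real model the amplitudes (and possibly the frequencies) would be trained by gradient descent in an autodiff framework rather than set at random.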
HEP-PH Jun 17, 2022
SYMBA: Symbolic Computation of Squared Amplitudes in High Energy Physics with Machine Learning
Abdulhakim Alnuqaydan, Sergei Gleyzer, Harrison Prosper
The cross section is one of the most important physical quantities in high-energy physics and the most time consuming to compute. While machine learning has proven to be highly successful in numerical calculations in high-energy physics, analytical calculations using machine learning are still in their infancy. In this work, we use a sequence-to-sequence model, specifically, a transformer, to compute a key element of the cross section calculation, namely, the squared amplitude of an interaction. We show that a transformer model is able to predict correctly 97.6% and 99% of squared amplitudes of QCD and QED processes, respectively, at a speed that is up to orders of magnitude faster than current symbolic computation frameworks. We discuss the performance of the current model, its limitations and possible future directions for this work.
QUANT-PH Nov 30, 2023
A Comparison Between Invariant and Equivariant Classical and Quantum Graph Neural Networks
Roy T. Forestano, Marçal Comajoan Cara, Gopal Ramesh Dahale et al.
Machine learning algorithms are heavily relied on to understand the vast amounts of data from high-energy particle collisions at the CERN Large Hadron Collider (LHC). The data from such collision events can naturally be represented with graph structures. Therefore, deep geometric methods, such as graph neural networks (GNNs), have been leveraged for various data analysis tasks in high-energy physics. One typical task is jet tagging, where jets are viewed as point clouds with distinct features and edge connections between their constituent particles. The increasing size and complexity of the LHC particle datasets, as well as the computational models used for their analysis, greatly motivate the development of alternative fast and efficient computational paradigms such as quantum computation. In addition, to enhance the validity and robustness of deep networks, one can leverage the fundamental symmetries present in the data through the use of invariant inputs and equivariant layers. In this paper, we perform a fair and comprehensive comparison between classical GNNs and equivariant graph neural networks (EGNNs) and their quantum counterparts: quantum graph neural networks (QGNNs) and equivariant quantum graph neural networks (EQGNNs). The four architectures were benchmarked on a binary classification task to classify the parton-level particle initiating the jet. Based on their AUC scores, the quantum networks were shown to outperform the classical networks. However, seeing the computational advantage of the quantum networks in practice may have to wait for the further development of quantum technology and its associated APIs.
QUANT-PH Nov 30, 2023
$\mathbb{Z}_2\times \mathbb{Z}_2$ Equivariant Quantum Neural Networks: Benchmarking against Classical Neural Networks
Zhongtian Dong, Marçal Comajoan Cara, Gopal Ramesh Dahale et al.
This paper presents a comprehensive comparative analysis of the performance of Equivariant Quantum Neural Networks (EQNN) and Quantum Neural Networks (QNN), juxtaposed against their classical counterparts: Equivariant Neural Networks (ENN) and Deep Neural Networks (DNN). We evaluate the performance of each network with two toy examples for a binary classification task, focusing on model complexity (measured by the number of parameters) and the size of the training data set. Our results show that the $\mathbb{Z}_2\times \mathbb{Z}_2$ EQNN and the QNN provide superior performance for smaller parameter sets and modest training data samples.
INS-DET Jun 13, 2023
Deep Learning-Based Spatiotemporal Multi-Event Reconstruction for Delay Line Detectors
Marco Knipfer, Stefan Meier, Jonas Heimerl et al.
Accurate observation of two or more particles within a very narrow time window has always been a challenge in modern physics. It creates the possibility of correlation experiments, such as the ground-breaking Hanbury Brown-Twiss experiment, leading to new physical insights. For low-energy electrons, one possibility is to use a microchannel plate with subsequent delay lines for the readout of the incident particle hits, a setup called a Delay Line Detector. The spatial and temporal coordinates of more than one particle can be fully reconstructed outside a region called the dead radius. For interesting events, where two electrons are close in space and time, the determination of the individual positions of the electrons requires elaborate peak finding algorithms. While classical methods work well with single particle hits, they fail to identify and reconstruct events caused by multiple nearby particles. To address this challenge, we present a new spatiotemporal machine learning model to identify and reconstruct the position and time of such multi-hit particle signals. This model achieves a much better resolution for nearby particle hits compared to the classical approach, removing some of the artifacts and reducing the dead radius by half. We show that machine learning models can be effective in improving the spatiotemporal performance of delay line detectors.
EP Nov 17, 2022
Locating Hidden Exoplanets in ALMA Data Using Machine Learning
Jason Terry, Cassandra Hall, Sean Abreau et al.
Exoplanets in protoplanetary disks cause localized deviations from Keplerian velocity in channel maps of molecular line emission. Current methods of characterizing these deviations are time consuming, and there is no unified standard approach. We demonstrate that machine learning can quickly and accurately detect the presence of planets. We train our model on synthetic images generated from simulations and apply it to real observations to identify forming planets in real systems. Machine learning methods, based on computer vision, are not only capable of correctly identifying the presence of one or more planets, but they can also correctly constrain the location of those planets.
QUANT-PH Mar 26
The Pareto Frontiers of Magic and Entanglement: The Case of Two Qubits
Alexander Roman, Marco Knipfer, Jogi Suda Neto et al.
Magic and entanglement are two measures that are widely used to characterize quantum resources. We study the interplay between magic and entanglement in two-qubit systems, focusing on the two extremes: maximal magic and minimal magic for a given level of entanglement. We quantify magic by the Rényi entropy of order 2, $M_2$, and entanglement by the concurrence $\Delta$. We find that the Pareto frontier of maximal magic $M_2^{(max)}(\Delta)$ is composed of three separate segments, while the boundary of minimal magic $M_2^{(min)}(\Delta)$ is a single continuous line. We derive simple analytical formulas for all these four cases, and explicitly parametrize all distinct quantum states of maximal or minimal magic at a given level of entanglement.
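To make the two quantities concrete, the following sketch evaluates both for a two-qubit pure state. It assumes that $M_2$ refers to the stabilizer Rényi entropy of order 2, a commonly used magic measure, and it uses a base-2 logarithm; both choices, along with the function names `concurrence` and `stabilizer_renyi_2`, are assumptions for illustration and may differ from the paper's conventions.

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def concurrence(psi):
    """Concurrence of a pure two-qubit state a|00> + b|01> + c|10> + d|11>."""
    a, b, c, d = psi
    return 2.0 * abs(a * d - b * c)

def stabilizer_renyi_2(psi):
    """Stabilizer 2-Renyi entropy: -log2( (1/4) * sum_P <psi|P|psi>^4 ),
    summed over the 16 two-qubit Pauli strings (base-2 log is an assumption)."""
    total = 0.0
    for p, q in itertools.product((I2, X, Y, Z), repeat=2):
        expval = np.vdot(psi, np.kron(p, q) @ psi).real
        total += expval ** 4
    return -np.log2(total / 4.0)

# Bell state: maximally entangled stabilizer state
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

# Product state (T|+>) tensor |0>: unentangled but not a stabilizer state
t_plus = np.array([1, np.exp(1j * np.pi / 4)], dtype=complex) / np.sqrt(2)
magic_product = np.kron(t_plus, np.array([1, 0], dtype=complex))

for name, state in [("bell", bell), ("T|+> x |0>", magic_product)]:
    print(name, concurrence(state), stabilizer_renyi_2(state))
```

The two example states sit at opposite corners of the picture: the Bell state has maximal entanglement and no magic, while the product state has no entanglement and nonzero magic.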
EP Jul 8, 2024
A Machine Learning Approach to Detecting Albedo Anomalies on the Lunar Surface
Sofia Strukova, Sergei Gleyzer, Patrick Peplowski et al.
This study introduces a data-driven approach using machine learning (ML) techniques to explore and predict albedo anomalies on the Moon's surface. The research leverages diverse planetary datasets, including high-spatial-resolution albedo maps and element maps (LPFe, LPK, LPTh, LPTi) derived from laser and gamma-ray measurements. The primary objective is to identify relationships between chemical elements and albedo, thereby expanding our understanding of planetary surfaces and offering predictive capabilities for areas with incomplete datasets. To bridge the gap in resolution between the albedo and element maps, we employ Gaussian blurring techniques, including an innovative adaptive Gaussian blur. Our methodology culminates in the deployment of an Extreme Gradient Boosting Regression Model, optimized to predict full albedo based on elemental composition. Furthermore, we present an interactive analytical tool to visualize prediction errors, delineating their spatial and chemical characteristics. The findings not only pave the way for a more comprehensive understanding of the Moon's surface but also provide a framework for similar studies on other celestial bodies.
QUANT-PH Feb 1, 2024
Hybrid Quantum Vision Transformers for Event Classification in High Energy Physics
Eyup B. Unlu, Marçal Comajoan Cara, Gopal Ramesh Dahale et al.
Models based on vision transformer architectures are considered state-of-the-art when it comes to image classification tasks. However, they require extensive computational resources both for training and deployment. The problem is exacerbated as the amount and complexity of the data increase. Quantum-based vision transformer models could potentially alleviate this issue by reducing the training and operating time while maintaining the same predictive power. Although current quantum computers are not yet able to perform high-dimensional tasks, they do offer one of the most efficient solutions for the future. In this work, we construct several variations of a quantum hybrid vision transformer for a classification problem in high energy physics (distinguishing photons and electrons in the electromagnetic calorimeter). We test them against classical vision transformer architectures. Our findings indicate that the hybrid models can achieve comparable performance to their classical analogues with a similar number of parameters.
QUANT-PH May 16, 2024
Quantum Vision Transformers for Quark-Gluon Classification
Marçal Comajoan Cara, Gopal Ramesh Dahale, Zhongtian Dong et al.
We introduce a hybrid quantum-classical vision transformer architecture, notable for its integration of variational quantum circuits within both the attention mechanism and the multi-layer perceptrons. The research addresses the critical challenge of computational efficiency and resource constraints in analyzing data from the upcoming High Luminosity Large Hadron Collider, presenting the architecture as a potential solution. In particular, we evaluate our method by applying the model to multi-detector jet images from CMS Open Data. The goal is to distinguish quark-initiated from gluon-initiated jets. We successfully train the quantum model and evaluate it via numerical simulations. Using this approach, we achieve classification performance almost on par with the one obtained with the completely classical architecture, considering a similar number of parameters.
QUANT-PH Nov 20, 2024
Quantum Attention for Vision Transformers in High Energy Physics
Alessandro Tesi, Gopal Ramesh Dahale, Sergei Gleyzer et al.
We present a novel hybrid quantum-classical vision transformer architecture incorporating quantum orthogonal neural networks (QONNs) to enhance performance and computational efficiency in high-energy physics applications. Building on advancements in quantum vision transformers, our approach addresses limitations of prior models by leveraging the inherent advantages of QONNs, including stability and efficient parameterization in high-dimensional spaces. We evaluate the proposed architecture using multi-detector jet images from CMS Open Data, focusing on the task of distinguishing quark-initiated from gluon-initiated jets. The results indicate that embedding quantum orthogonal transformations within the attention mechanism can provide robust performance while offering promising scalability for machine learning challenges associated with the upcoming High Luminosity Large Hadron Collider. This work highlights the potential of quantum-enhanced models to address the computational demands of next-generation particle physics experiments.
QUANT-PH Dec 30, 2024
Quantum Diffusion Model for Quark and Gluon Jet Generation
Mariia Baidachna, Rey Guadarrama, Gopal Ramesh Dahale et al.
Diffusion models have demonstrated remarkable success in image generation, but they are computationally intensive and time-consuming to train. In this paper, we introduce a novel diffusion model that benefits from quantum computing techniques in order to mitigate computational challenges and enhance generative performance within high energy physics data. The fully quantum diffusion model replaces Gaussian noise with random unitary matrices in the forward process and incorporates a variational quantum circuit within the U-Net in the denoising architecture. We run evaluations on the structurally complex quark and gluon jets dataset from the Large Hadron Collider. The results demonstrate that the fully quantum and hybrid models are competitive with a similar classical model for jet generation, highlighting the potential of using quantum techniques for machine learning problems.
QUANT-PH Nov 22, 2024
Lie-Equivariant Quantum Graph Neural Networks
Jogi Suda Neto, Roy T. Forestano, Sergei Gleyzer et al.
Discovering new phenomena at the Large Hadron Collider (LHC) involves the identification of rare signals over conventional backgrounds. Thus, binary classification tasks are ubiquitous in analyses of the vast amounts of LHC data. We develop a Lie-Equivariant Quantum Graph Neural Network (Lie-EQGNN), a quantum model that is not only data efficient, but also has symmetry-preserving properties. Since Lorentz group equivariance has been shown to be beneficial for jet tagging, we build a Lorentz-equivariant quantum GNN for quark-gluon jet discrimination and show that its performance is on par with its classical state-of-the-art counterpart LorentzNet, making it a viable alternative to the conventional computing paradigm.
IM Oct 9, 2025
FlowLensing: Simulating Gravitational Lensing with Flow Matching
Hamees Sayed, Pranath Reddy, Michael W. Toomey et al.
Gravitational lensing is one of the most powerful probes of dark matter, yet creating high-fidelity lensed images at scale remains a bottleneck. Existing tools rely on ray-tracing or forward-modeling pipelines that, while precise, are prohibitively slow. We introduce FlowLensing, a compact and efficient flow-matching model based on a Diffusion Transformer for strong gravitational lensing simulation. FlowLensing operates in both discrete and continuous regimes, handling classes such as different dark matter models as well as continuous model parameters, ensuring physical consistency. By enabling scalable simulations, our model can advance dark matter studies, specifically for probing dark matter substructure in cosmological surveys. We find that our model achieves a speedup of over 200$\times$ compared to classical simulators for intensive dark matter models, with high fidelity and low inference latency. FlowLensing enables rapid, scalable, and physically consistent image synthesis, offering a practical alternative to traditional forward-modeling pipelines.
ML Aug 1, 2025
Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks
Sergei Gleyzer, Hanh Nguyen, Dinesh P. Ramakrishnan et al.
The Kolmogorov-Arnold representation theorem states that any continuous multivariable function can be exactly represented as a finite superposition of continuous single variable functions. Subsequent simplifications of this representation involve expressing these functions as parameterized sums of a smaller number of unique monotonic functions. These developments led to the proof of the universal approximation capabilities of multilayer perceptron networks with sigmoidal activations, forming the alternative theoretical direction of most modern neural networks. Kolmogorov-Arnold Networks (KANs) have been recently proposed as an alternative to multilayer perceptrons. KANs feature learnable nonlinear activations applied directly to input values, modeled as weighted sums of basis spline functions. This approach replaces the linear transformations and sigmoidal post-activations used in traditional perceptrons. Subsequent works have explored alternatives to spline-based activations. In this work, we propose a novel KAN variant by replacing both the inner and outer functions in the Kolmogorov-Arnold representation with weighted sinusoidal functions of learnable frequencies. Inspired by simplifications introduced by Lorentz and Sprecher, we fix the phases of the sinusoidal activations to linearly spaced constant values and provide a proof of its theoretical validity. We also conduct numerical experiments to evaluate its performance on a range of multivariable functions, comparing it with fixed-frequency Fourier transform methods and multilayer perceptrons (MLPs). We show that it outperforms the fixed-frequency Fourier transform and achieves comparable performance to MLPs.
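For reference, the representation theorem mentioned at the start of this abstract is usually written in the following standard form, for a continuous function $f$ on $[0,1]^n$ with continuous univariate outer functions $\Phi_q$ and inner functions $\phi_{q,p}$. The symbol names are the conventional ones and are not taken from the abstract.

$$
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
$$

The variant proposed in the abstract replaces both the inner and the outer functions with weighted sinusoidal functions of learnable frequencies.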
QUANT-PH Jun 2, 2025
Probing Quantum Spin Systems with Kolmogorov-Arnold Neural Network Quantum States
Mahmud Ashraf Shamim, Eric A F Reinhardt, Talal Ahmed Chowdhury et al.
Neural Quantum States (NQS) are a class of variational wave functions parametrized by neural networks (NNs) to study quantum many-body systems. In this work, we propose \texttt{SineKAN}, an NQS \textit{ansatz} based on Kolmogorov-Arnold Networks (KANs), to represent quantum mechanical wave functions as nested univariate functions. We show that the \texttt{SineKAN} wave function with learnable sinusoidal activation functions can capture the ground state energies, fidelities and various correlation functions of the one-dimensional Transverse-Field Ising model, Anisotropic Heisenberg model, and Antiferromagnetic $J_{1}-J_{2}$ model with different chain lengths. In our study of the $J_1-J_2$ model with $L=100$ sites, we find that the \texttt{SineKAN} model outperforms several previously explored neural quantum state \textit{ansätze}, including Restricted Boltzmann Machines (RBMs), Long Short-Term Memory models (LSTMs), and Multi-layer Perceptrons (MLPs) \textit{a.k.a.} Feed Forward Neural Networks, when compared to the results obtained from the Density Matrix Renormalization Group (DMRG) algorithm. We find that \texttt{SineKAN} models can be trained to high precision and accuracy with minimal computational costs.
DATA-AN Apr 19, 2021
End-to-End Jet Classification of Boosted Top Quarks with the CMS Open Data
Michael Andrews, Bjorn Burkle, Yi-fan Chen et al.
We describe a novel application of the end-to-end deep learning technique to the task of discriminating top quark-initiated jets from those originating from the hadronization of a light quark or a gluon. The end-to-end deep learning technique combines deep learning algorithms and low-level detector representation of the high-energy collision event. In this study, we use low-level detector information from the simulated CMS Open Data samples to construct the top jet classifiers. To optimize classifier performance we progressively add low-level information from the CMS tracking detector, including pixel detector reconstructed hits and impact parameters, and demonstrate the value of additional tracking information even when no new spatial structures are added. Relying only on calorimeter energy deposits and reconstructed pixel detector hits, the end-to-end classifier achieves an AUC score of 0.975$\pm$0.002 for the task of classifying boosted top quark jets. After adding derived track quantities, the classifier AUC score increases to 0.9824$\pm$0.0013, serving as the first performance benchmark for these CMS Open Data samples. We additionally provide a timing performance comparison of different processor unit architectures for training the network.
HEP-EX Apr 5, 2021
Graph Generative Models for Fast Detector Simulations in High Energy Physics
Ali Hariri, Darya Dyachkova, Sergei Gleyzer
Accurate and fast simulation of particle physics processes is crucial for the high-energy physics community. Simulating particle interactions with detectors is both time consuming and computationally expensive. With the proton-proton collision energy of 13 TeV, the Large Hadron Collider is uniquely positioned to detect and measure the rare phenomena that can shape our knowledge of new interactions. The High-Luminosity Large Hadron Collider (HL-LHC) upgrade will put a significant strain on the computing infrastructure due to increased event rate and levels of pile-up. Simulation of high-energy physics collisions needs to be significantly faster without sacrificing the physics accuracy. Machine learning approaches can offer faster solutions, while maintaining a high level of fidelity. We discuss a graph generative model that provides effective reconstruction of LHC events, paving the way for full detector level fast simulation for HL-LHC.
HEP-EX Feb 21, 2019
End-to-End Jet Classification of Quarks and Gluons with the CMS Open Data
Michael Andrews, John Alison, Sitong An et al.
We describe the construction of end-to-end jet image classifiers based on simulated low-level detector data to discriminate quark- vs. gluon-initiated jets with high-fidelity simulated CMS Open Data. We highlight the importance of precise spatial information and demonstrate competitive performance to existing state-of-the-art jet classifiers. We further generalize the end-to-end approach to event-level classification of quark vs. gluon di-jet QCD events. We compare the fully end-to-end approach to using hand-engineered features and demonstrate that the end-to-end algorithm is robust against the effects of underlying event and pile-up.
DATA-AN Jul 31, 2018
End-to-End Physics Event Classification with CMS Open Data: Applying Image-Based Deep Learning to Detector Data for the Direct Classification of Collision Events at the LHC
Michael Andrews, Manfred Paulini, Sergei Gleyzer et al.
This paper describes the construction of novel end-to-end image-based classifiers that directly leverage low-level simulated detector data to discriminate signal and background processes in pp collision events at the Large Hadron Collider at CERN. To better understand what end-to-end classifiers are capable of learning from the data and to address a number of associated challenges, we distinguish the decay of the standard model Higgs boson into two photons from its leading background sources using high-fidelity simulated CMS Open Data. We demonstrate the ability of end-to-end classifiers to learn from the angular distribution of the photons recorded as electromagnetic showers, their intrinsic shapes, and the energy of their constituent hits, even when the underlying particles are not fully resolved, delivering a clear advantage in such cases over purely kinematics-based classifiers.
COMP-PH Jul 8, 2018
Machine Learning in High Energy Physics Community White Paper
Kim Albertsson, Piero Altoe, Dustin Anderson et al.
Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document, we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally, we identify areas where collaboration with external communities will be of great benefit.