LG · Feb 1, 2023
Versatile Energy-Based Probabilistic Models for High Energy Physics
Taoli Cheng, Aaron Courville
As a classical generative modeling approach, energy-based models have the natural advantage of flexibility in the form of the energy function. Recently, energy-based models have achieved great success in modeling high-dimensional data in computer vision and natural language processing. In line with these advancements, we build a multi-purpose energy-based probabilistic model for High Energy Physics events at the Large Hadron Collider. This framework builds on a powerful generative model and describes higher-order inter-particle interactions. It accommodates different encoding architectures and relies on implicit generation. In terms of applications, it can serve as a powerful parameterized event generator for physics simulation, a generic anomalous signal detector free from spurious correlations, and an augmented event classifier for particle identification.
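To make "implicit generation" from an energy function concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy one-dimensional energy E(x) = x²/2 (so the density proportional to exp(-E) is a standard normal) and assumes unadjusted Langevin dynamics as the sampler; the abstract specifies neither. The names `energy_gradient` and `langevin_sample` are hypothetical.

```python
import math
import random


def energy_gradient(x):
    """Gradient of the toy energy E(x) = 0.5 * x**2 (an assumption for illustration)."""
    return x


def langevin_sample(grad_energy, x0, step_size, n_steps, rng):
    """Run unadjusted Langevin dynamics and return the final state.

    Each step moves downhill in energy and adds Gaussian noise:
        x <- x - (step_size / 2) * dE/dx + sqrt(step_size) * noise
    """
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step_size * grad_energy(x) + math.sqrt(step_size) * noise
    return x


# Draw samples from random starting points; they should look roughly standard normal.
rng = random.Random(0)
samples = [
    langevin_sample(energy_gradient, rng.uniform(-3.0, 3.0), 0.05, 200, rng)
    for _ in range(2000)
]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean = {mean:.3f}, sample variance = {variance:.3f}")
```

In a realistic model the energy would be a neural network over event features and the gradient would come from automatic differentiation; the sampling loop keeps the same shape.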
ML · Oct 24, 2022
Bridging Machine Learning and Sciences: Opportunities and Challenges
Taoli Cheng
The application of machine learning in the sciences has seen exciting advances in recent years. As a widely applicable technique, anomaly detection has long been studied in the machine learning community. In particular, out-of-distribution detection based on deep neural networks has made great progress for high-dimensional data. Recently, these techniques have been showing their potential in scientific disciplines. We take a critical look at their prospects for application, including data universality, experimental protocols, model robustness, etc. We discuss examples that display both transferable practices and domain-specific challenges, providing a starting point for establishing a novel interdisciplinary research paradigm in the near future.
HEP-PH · Jul 3, 2020
Variational Autoencoders for Anomalous Jet Tagging
Taoli Cheng, Jean-François Arguin, Julien Leissner-Martin et al.
We present a detailed study on Variational Autoencoders (VAEs) for anomalous jet tagging at the Large Hadron Collider. By taking in low-level jet constituents' information and training on background QCD jets in an unsupervised manner, the VAE is able to encode important information for reconstructing jets, while learning an expressive posterior distribution in the latent space. When using the VAE as an anomaly detector, we present different approaches to detect anomalies: directly comparing in the input space or, instead, working in the latent space. In order to facilitate general search approaches such as bump hunting, mass-decorrelated VAEs based on distance correlation regularization are also studied. We find that the naive mass-decorrelated VAEs fail to maintain proper detection performance, because they assign higher probabilities to some anomalous samples. To build a performant mass-decorrelated anomalous jet tagger, we propose the Outlier Exposed VAE (OE-VAE), for which some outlier samples are introduced in the training process to guide the learned information. OE-VAEs are employed to achieve two goals at the same time: increasing the sensitivity of outlier detection and decorrelating jet mass from the anomaly score. We succeed in reaching excellent results on both fronts. The code implementation of this work can be found at https://github.com/taolicheng/VAE-Jet
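The quantity behind "distance correlation regularization" can be illustrated with a short sketch. The function below computes the standard (biased) sample distance correlation between two one-dimensional samples, for example an anomaly score and the jet mass. It is not taken from the linked repository; the names `distance_correlation` and `_centered_distances` are hypothetical, and how the term would be weighted in a training loss is not stated in the abstract.

```python
import math
import random


def _centered_distances(values):
    """Pairwise |v_i - v_j| matrix with row, column and grand means removed."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Biased sample distance correlation of two equal-length 1D samples, in [0, 1]."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = max(mean_product(a, b), 0.0)  # guard against tiny negative round-off
    dvar = math.sqrt(mean_product(a, a) * mean_product(b, b))
    return 0.0 if dvar == 0.0 else math.sqrt(dcov2 / dvar)


# Toy check: a nonlinear dependence is detected, unlike with Pearson correlation.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
print("dependent  :", distance_correlation(x, [v * v for v in x]))
print("independent:", distance_correlation(x, noise))
```

Because distance correlation vanishes (in the large-sample limit) only for independent variables, penalizing it between the anomaly score and the jet mass discourages nonlinear as well as linear mass dependence.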
HEP-PH · Jan 18, 2022
Invariant Representation Driven Neural Classifier for Anti-QCD Jet Tagging
Taoli Cheng, Aaron Courville
We leverage representation learning and the inductive bias in neural-net-based Standard Model jet classification tasks to detect non-QCD signal jets. In establishing the framework for classification-based anomaly detection in jet physics, we demonstrate that, with a well-calibrated and powerful enough feature extractor, a well-trained mass-decorrelated supervised Standard Model neural jet classifier can serve as a strong generic anti-QCD jet tagger for effectively reducing the QCD background. Imposing data-augmented mass-invariance (and thus decoupling the dominant factor) not only facilitates background estimation, but also induces more substructure-aware representation learning. We are able to reach excellent tagging efficiencies for all the test signals considered. In the best case, we reach a background rejection rate of 51 and a significance improvement factor of 3.6 at 50% signal acceptance, with the jet mass decorrelated. This study indicates that supervised Standard Model jet classifiers have great potential in general new physics searches.
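The two figures quoted above are consistent with each other under the usual conventions, which are assumed here rather than stated in the abstract: background rejection is the inverse of the background efficiency, and the significance improvement is the signal efficiency divided by the square root of the background efficiency. The helper name `significance_improvement` is hypothetical.

```python
import math


def significance_improvement(signal_efficiency, background_rejection):
    """Return eps_S / sqrt(eps_B), with eps_B = 1 / background_rejection (assumed convention)."""
    if background_rejection <= 0:
        raise ValueError("background_rejection must be positive")
    background_efficiency = 1.0 / background_rejection
    return signal_efficiency / math.sqrt(background_efficiency)


# Best case quoted in the abstract: rejection of 51 at 50% signal acceptance.
sic = significance_improvement(0.5, 51.0)
print(round(sic, 1))  # 3.6
```

With these definitions, 0.5 × √51 ≈ 3.57, which rounds to the quoted 3.6.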
HEP-PH · Nov 5, 2019
Interpretability Study on Deep Learning for Jet Physics at the Large Hadron Collider
Taoli Cheng
Using deep neural networks to identify physics objects at the Large Hadron Collider (LHC) has become a powerful alternative approach in recent years. After deep neural networks have been trained successfully, examining the trained networks not only helps us understand their behaviour, but also helps improve the performance of deep learning models through proper interpretation. Taking the jet tagging problem at the LHC as an example and using recursive neural networks as a starting point, we aim at a thorough understanding of the behaviour of physics-oriented DNNs and of the information encoded in the embedding space. We make a comparative study of a series of different jet tagging tasks dominated by different underlying physics. Interesting observations on the latent space are obtained.
HEP-PH · Nov 7, 2017
Recursive Neural Networks in Quark/Gluon Tagging
Taoli Cheng
With machine learning techniques improving rapidly, it has been shown that image recognition techniques based on deep neural networks can be used to detect jet substructure, and that deep neural networks can match or outperform the traditional approach of expert features. However, there are disadvantages, such as the sparseness of jet images. Based on the natural tree-like structure of sequential jet clustering, recursive neural networks (RecNNs), which embed the jet clustering history recursively as in natural language processing, behave better when confronted with these problems. We therefore explore the performance of RecNNs in quark/gluon discrimination. The results show that RecNNs outperform the baseline boosted decision tree (BDT) by a few percent in gluon rejection rate. However, additionally including particle flow identification increases the performance only slightly. We also experimented with some relevant aspects that might influence the performance of the networks. We find that taking only particle flow identification as the input feature, without any extra information on momentum or angular position, already gives a fairly good result, which indicates that most of the information for quark/gluon discrimination is already contained in the tree structure itself. As a bonus, a rough discrimination between up and down quark jets is also explored.