LG | HEP-PH | Mar 28, 2025

Learnable cut flow for high energy physics

arXiv:2503.22498v44.1 | h-index: 5 | Has Code | Journal of High Energy Physics

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable, automated feature selection in high energy physics. Its hybrid approach is an incremental advance that merges existing methods.

The paper combines the interpretability of traditional cut flow methods with the power of neural networks in high energy physics. It proposes the Learnable Cut Flow (LCF), which transforms cut selection into a differentiable, data-driven process. LCF demonstrates accurate boundary learning and robust feature handling on mock datasets and on a realistic diboson vs. QCD dataset, where it initially underperforms other methods.

Neural networks have emerged as a powerful paradigm for tasks in high energy physics, yet their opaque training process renders them black boxes. In contrast, the traditional cut flow method offers simplicity and interpretability but requires extensive manual tuning to identify optimal cut boundaries. To merge the strengths of both approaches, we propose the Learnable Cut Flow (LCF), a neural network that transforms traditional cut selection into a fully differentiable, data-driven process. To flexibly determine optimal boundaries, LCF implements two cut strategies: parallel, where observable distributions are treated independently, and sequential, where prior cuts shape subsequent ones. Building on these strategies, we introduce the Learnable Importance, a metric that quantifies feature importance and adjusts each feature's contribution to the loss accordingly, offering model-driven insights unlike ad-hoc metrics. To ensure differentiability, a modified loss function replaces hard cuts with mask operations, preserving data shape throughout the training process. LCF is tested on six varied mock datasets and a realistic diboson vs. QCD dataset. Results demonstrate that LCF (1) accurately learns cut boundaries across typical feature distributions in both parallel and sequential strategies, (2) assigns higher importance to discriminative features with minimal overlap, (3) handles redundant or correlated features robustly, and (4) performs effectively in real-world scenarios. In the diboson dataset, LCF initially underperforms boosted decision trees and multilayer perceptrons when using all observables. LCF bridges the gap between the traditional cut flow method and modern black-box neural networks, delivering actionable insights into the training process and feature importance. Source code and experimental data are available at https://github.com/Star9daisy/learnable-cut-flow.
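The idea of replacing hard cuts with shape-preserving masks, and the difference between the parallel and sequential strategies, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the sigmoid-based mask, the `sharpness` parameter, the cumulative-product form of the sequential strategy, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def soft_cut_mask(x, lower, upper, sharpness=10.0):
    """Smooth stand-in for the hard cut lower < x < upper (assumed form).

    Returns values in (0, 1) with the same shape as x, so no events are
    dropped and the result stays differentiable in lower and upper.
    """
    x, lower, upper = np.asarray(x), np.asarray(lower), np.asarray(upper)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    return sigmoid(sharpness * (x - lower)) * sigmoid(sharpness * (upper - x))

def parallel_masks(X, lowers, uppers, sharpness=10.0):
    """Parallel strategy (sketch): each observable is masked independently."""
    return soft_cut_mask(X, lowers, uppers, sharpness)

def sequential_masks(X, lowers, uppers, sharpness=10.0):
    """Sequential strategy (sketch): the mask for observable j is multiplied
    by the masks of all earlier observables, so prior cuts shape later ones."""
    return np.cumprod(parallel_masks(X, lowers, uppers, sharpness), axis=1)

# Three events, two observables; both cuts are 0 < x < 1.
X = np.array([[0.5, 0.5],   # passes both cuts
              [2.0, 0.5],   # fails the first cut only
              [0.5, 2.0]])  # fails the second cut only
lowers, uppers = [0.0, 0.0], [1.0, 1.0]

p = parallel_masks(X, lowers, uppers)
s = sequential_masks(X, lowers, uppers)
print(p.shape, s.shape)  # both keep the shape of X
```

In this sketch, the second event keeps a mask value near 1 for the second observable under the parallel strategy, but a value near 0 under the sequential strategy, because it already failed the first cut. A hard cut would instead remove the event and change the array shape, which is what the mask operation avoids.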
