Data-Driven, Geometry-Aware Optimal-Transport Calibration of Flavor Tagger

arXiv:2605.01363
AI Analysis

For high-energy physics analyses requiring precise flavor tagging, this method provides a more accurate and continuous calibration, reducing information loss compared to existing discrete approaches.

This work addresses continuous, event-level calibration of flavor taggers across their full multicomponent outputs; existing calibrations are limited to discrete working points or binned corrections of a single discriminant. The proposed geometry-aware optimal-transport framework, which uses isometric log-ratio coordinates and an EM technique with normalizing flows, achieves improved closure in control regions and in independent validation mixtures in a simulation-based study.

Flavor-tagging calibrations are often provided either as scale factors measured at a finite set of working points or as binned corrections to a chosen one-dimensional discriminant. However, these approaches fall short of providing continuous, event-level calibration across the full multicomponent outputs of modern taggers. This limitation leads to information loss in analyses that demand high-performance flavor tagging and restricts them to a limited set of predefined variables. In this work, we propose a geometry-aware framework that formulates flavor-tagger calibration as an optimal transport problem on the probability simplex. The transport maps are parameterized and trained in the isometric log-ratio coordinate system. Because the quadratic Euclidean cost of Brenier transport in this coordinate system equals the squared Aitchison distance on the simplex, the learned map induces a minimal deformation under the Aitchison geometry. Furthermore, we extract flavor-conditional target distributions directly from control-region data using an expectation-maximization (EM) technique that simultaneously fits multiple control regions, models each flavor component with a normalizing flow, and estimates the regional mixture fractions. The extracted targets are subsequently used to learn flavor-factorized transport maps. Because the joint estimation of mixture fractions and flexible component densities admits weakly constrained directions, we further introduce a linearized feedback-operator analysis that propagates the fitted composition covariance into the extracted component densities, separating data-constrained modes from those dominated by the composition prior. A simulation-based closure study demonstrates improved closure in dedicated control regions and in independent validation mixtures.
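
The link between the isometric log-ratio (ILR) coordinates and the Aitchison geometry can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes a three-class tagger output (for example b, c, light), uses Dirichlet samples as stand-in tagger scores, and picks a Helmert-type orthonormal basis, which is only one of many valid ILR bases. All function names are hypothetical.

```python
import numpy as np

def helmert_basis(d):
    """Return a (d-1, d) matrix whose rows are orthonormal and sum to zero."""
    v = np.zeros((d - 1, d))
    for i in range(1, d):
        v[i - 1, :i] = 1.0 / i
        v[i - 1, i] = -1.0
        v[i - 1] *= np.sqrt(i / (i + 1.0))
    return v

def clr(p):
    """Centered log-ratio transform of points on the simplex."""
    logp = np.log(p)
    return logp - logp.mean(axis=-1, keepdims=True)

def ilr(p, basis):
    """ILR coordinates: project the clr vector onto the orthonormal basis."""
    return clr(p) @ basis.T

def ilr_inverse(z, basis):
    """Map ILR coordinates back to the simplex (a softmax of the clr vector)."""
    y = z @ basis
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def aitchison_distance(p, q):
    """Aitchison distance, defined as the Euclidean norm of the clr difference."""
    return np.linalg.norm(clr(p) - clr(q), axis=-1)

# Stand-in tagger outputs (assumption): 5 events, 3 flavor scores each.
rng = np.random.default_rng(0)
p = rng.dirichlet([2.0, 2.0, 2.0], size=5)
q = rng.dirichlet([2.0, 2.0, 2.0], size=5)

basis = helmert_basis(3)
z_p, z_q = ilr(p, basis), ilr(q, basis)

# Euclidean distance in ILR coordinates matches the Aitchison distance.
euclid = np.linalg.norm(z_p - z_q, axis=-1)
assert np.allclose(euclid, aitchison_distance(p, q))

# The transform is invertible, so a map learned in ILR space acts on the simplex.
assert np.allclose(ilr_inverse(z_p, basis), p)
```

Because the ILR map is an isometry, a transport map trained with a squared Euclidean cost in these coordinates is measured by the squared Aitchison distance once mapped back to the simplex, whichever orthonormal basis is chosen.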
