HEP-EX, LG, DATA-AN, ML | Oct 17, 2019

Machine Learning on sWeighted Data

arXiv:1912.02590v1
Originality: Incremental advance
AI Analysis

The paper addresses a specific bottleneck for high energy physics researchers who use the sPlot technique: it offers a practical way to apply machine learning methods to sWeighted data.

The paper tackles the problem of applying machine learning to sWeighted data. Some sPlot weights are negative, which makes the loss function unbounded and causes training to diverge. The authors propose a mathematically rigorous transformation of sPlot weights into class probabilities conditioned on observables, which allows any machine learning algorithm to be used without modification.

Data analysis in high energy physics has to deal with data samples produced from different sources. One of the most widely used ways to unfold their contributions is the sPlot technique. It uses the results of a maximum likelihood fit to assign weights to events. Some weights produced by sPlot are by design negative. Negative weights make it difficult to apply machine learning methods: the loss function becomes unbounded, which leads to divergent neural network training. In this paper we propose a mathematically rigorous way to transform the weights obtained by sPlot into class probabilities conditioned on observables, thus enabling the application of any machine learning algorithm out of the box.
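The unboundedness mentioned in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: the three-event sample, the weight values, and the helper names `weighted_log_loss` and `loss_with_negative_event_at` are hypothetical, and the sketch assumes a model flexible enough to set its prediction for each event independently. Under those assumptions, the weighted cross-entropy can be driven arbitrarily low by making the prediction for a negative-weight event more and more wrong.

```python
import math


def weighted_log_loss(weights, labels, probs):
    """Weighted binary cross-entropy: sum_i w_i * (-y_i log p_i - (1 - y_i) log(1 - p_i))."""
    total = 0.0
    for w, y, p in zip(weights, labels, probs):
        total += w * -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total


# Hypothetical toy sample: three events labelled as signal (y = 1).
# The last event carries a negative weight, as sPlot can produce.
weights = [1.2, 0.9, -0.4]
labels = [1, 1, 1]


def loss_with_negative_event_at(p_neg):
    # Keep the positive-weight events at a fixed, confident prediction
    # and vary only the prediction for the negative-weight event.
    return weighted_log_loss(weights, labels, [0.9, 0.9, p_neg])


# Push the prediction for the negative-weight event towards 0.
losses = [loss_with_negative_event_at(p) for p in (1e-1, 1e-3, 1e-6, 1e-12)]

for p, value in zip((1e-1, 1e-3, 1e-6, 1e-12), losses):
    print(f"p = {p:g}: loss = {value:.3f}")
```

Each step towards p = 0 lowers the loss further, with no lower bound, so a gradient-based optimiser has no finite minimum to converge to. Replacing the weights with targets in [0, 1], such as class probabilities, removes this behaviour because every term of the cross-entropy is then non-negative.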

