HEP-PH LG Dec 30, 2025

Quantitative Understanding of PDF Fits and their Uncertainties

Amedeo Chiefa, Luigi Del Debbio, Richard Kenway

arXiv:2512.24116v21.21 citations, h-index: 53

Originality: Incremental advance

AI Analysis

This work provides a diagnostic tool for assessing the robustness of PDF fitting methodologies in particle physics, with potential broader applications in machine learning. It is rated incremental because it builds on existing ML techniques without introducing a new paradigm.

The authors address the problem of understanding and quantifying uncertainties in Parton Distribution Function (PDF) fits, which are crucial for high-precision collider experiments. They develop a theoretical framework based on the Neural Tangent Kernel (NTK) to analyze neural network training dynamics, which yields an analytical description of how uncertainties propagate from data to fitted functions.

Parton Distribution Functions (PDFs) play a central role in describing experimental data at colliders and provide insight into the structure of nucleons. As the LHC enters an era of high-precision measurements, a robust PDF determination with a reliable uncertainty quantification has become mandatory in order to match the experimental precision. The NNPDF collaboration has pioneered the use of Machine Learning (ML) techniques for PDF determinations, using Neural Networks (NNs) to parametrise the unknown PDFs in a flexible and unbiased way. The NNs are then trained on experimental data by means of stochastic gradient descent algorithms. The statistical robustness of the results is validated by extensive closure tests using synthetic data. In this work, we develop a theoretical framework based on the Neural Tangent Kernel (NTK) to analyse the training dynamics of neural networks. This approach allows us to derive, under precise assumptions, an analytical description of the neural network evolution during training, enabling a quantitative understanding of the training process. Having an analytical handle on the training dynamics allows us to clarify the role of the NN architecture and the impact of the experimental data in a transparent way. Similarly, we are able to describe the evolution of the covariance of the NN output during training, providing a quantitative description of how uncertainties are propagated from the data to the fitted function. While our results are not a substitute for PDF fitting, they do provide a powerful diagnostic tool to assess the robustness of current fitting methodologies. Beyond its relevance for particle physics phenomenology, our analysis of PDF determinations provides a testbed to apply theoretical ideas about the learning process developed in the ML community.
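To make the idea of an analytical handle on training concrete, the following is a minimal sketch and not the authors' framework. It assumes a model that is linear in its parameters, so that the NTK is exactly constant, a plain least-squares loss instead of a full PDF fit, and a fixed initialisation. Under those assumptions gradient flow has the closed form f_t = y + exp(-Theta t) (f_0 - y), and a data covariance C_Y propagates to the output as (1 - exp(-Theta t)) C_Y (1 - exp(-Theta t))^T. All names (`Phi`, `Theta`, `C_Y`, `mean_output`, `output_covariance`, `gradient_descent`) and all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumption): 5 data points, 20 parameters, linear model f = Phi @ theta.
n_data, n_feat = 5, 20
Phi = rng.normal(size=(n_data, n_feat)) / np.sqrt(n_feat)  # Jacobian df/dtheta
Theta = Phi @ Phi.T  # NTK; constant because the model is linear in theta

y = rng.normal(size=n_data)          # central values of the synthetic data
C_Y = 0.1 * np.eye(n_data)           # assumed data covariance
theta0 = rng.normal(size=n_feat)     # fixed initial parameters
f0 = Phi @ theta0                    # network output at initialisation

# Theta is symmetric, so exp(-Theta t) follows from its eigendecomposition.
lam, V = np.linalg.eigh(Theta)


def decay(t):
    """Return exp(-Theta * t) as V diag(exp(-lam t)) V^T."""
    return (V * np.exp(-lam * t)) @ V.T


def mean_output(t):
    """Closed-form gradient-flow solution for the loss 0.5 * ||f - y||^2."""
    return y + decay(t) @ (f0 - y)


def output_covariance(t):
    """Covariance of f_t induced by fluctuations of y, with f0 held fixed."""
    A = np.eye(n_data) - decay(t)
    return A @ C_Y @ A.T


def gradient_descent(t, lr=1e-3):
    """Discrete gradient descent up to training time t, for comparison."""
    theta = theta0.copy()
    for _ in range(int(round(t / lr))):
        theta -= lr * Phi.T @ (Phi @ theta - y)
    return Phi @ theta
```

In this toy setting the eigenvalues of `Theta` set the rate at which each direction in data space is learned, and the output covariance grows from zero at initialisation towards `C_Y` at late training times. A finite-width network trained on real data would only approximately satisfy the constant-kernel assumption.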
