LG | HEP-EX | DATA-AN | Jan 12, 2025

Introduction to the Usage of Open Data from the Large Hadron Collider for Computer Scientists in the Context of Machine Learning

arXiv:2501.06896v1 | 3 citations | h-index: 2 | SciPost Physics Lecture Notes
Originality: Synthesis-oriented
AI Analysis

This work addresses the data accessibility problem for computer scientists and physicists aiming to collaborate on machine learning in particle physics, but it is incremental as it focuses on format conversion without introducing new methods.

The study tackled the challenge of making Large Hadron Collider open data accessible to computer scientists by converting it from the ROOT format to pandas DataFrames, providing a foundation for interdisciplinary collaboration in machine learning applications.

Deep learning techniques have evolved rapidly in recent years, significantly impacting various scientific fields, including experimental particle physics. To effectively leverage the latest developments in computer science for particle physics, a strengthened collaboration between computer scientists and physicists is essential. As all machine learning techniques depend on the availability and comprehensibility of extensive data, clear data descriptions and commonly used data formats are prerequisites for successful collaboration. In this study, we converted open data from the Large Hadron Collider, recorded in the ROOT data format commonly used in high-energy physics, to pandas DataFrames, a well-known format in computer science. Additionally, we provide a brief introduction to the data's content and interpretation. This paper aims to serve as a starting point for future interdisciplinary collaborations between computer scientists and physicists, fostering closer ties and facilitating efficient knowledge exchange.
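The conversion described in the abstract can be illustrated with a minimal sketch. The abstract does not say which tooling the authors used, so the choice of the third-party `uproot` library here is an assumption, as are the function name `root_tree_to_dataframe`, the file name `events.root`, the tree name `Events`, and the branch names in the usage comment. The sketch also assumes flat branches (one value per event); variable-length branches need extra handling.

```python
from typing import Optional, Sequence


def root_tree_to_dataframe(path: str, tree_name: str,
                           branches: Optional[Sequence[str]] = None):
    """Read branches of a ROOT TTree into a pandas DataFrame.

    Hypothetical helper for illustration, not the authors' code.
    Assumes flat (one value per event) branches.
    """
    # Imported lazily: uproot is third-party and assumed to be installed
    # together with pandas.
    import uproot

    # None tells uproot to read every branch in the tree.
    selected = list(branches) if branches is not None else None

    with uproot.open(path) as root_file:
        tree = root_file[tree_name]
        # library="pd" asks uproot to return a pandas DataFrame.
        return tree.arrays(selected, library="pd")


# Example usage (file, tree, and branch names are placeholders):
# df = root_tree_to_dataframe("events.root", "Events", ["pt", "eta"])
# print(df.head())
```

Once the data is in a DataFrame, the usual pandas operations (filtering, grouping, exporting to NumPy arrays for a machine learning framework) apply without any ROOT-specific knowledge.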
