HEP-EX CV Dec 1, 2025

Panda: Self-distillation of Reusable Sensor-level Representations for High Energy Physics

arXiv:2512.01324v12.31 citations h-index: 2

Originality: Incremental advance

AI Analysis

This work aims to reduce the reliance on labeled data and complex reconstruction pipelines in high-energy physics experiments, offering a more efficient approach for neutrino and rare-event detection. It appears incremental, however, because it builds on existing self-distillation and representation-learning methods.

The paper tackles physics reconstruction in liquid argon time projection chambers (LArTPCs), which typically relies on complex, detector-specific pipelines and extensive labeled simulation. It introduces Panda, a model that learns reusable sensor-level representations from raw unlabeled data. The results show that Panda substantially improves label efficiency and reconstruction quality: it beats the previous state-of-the-art (SOTA) semantic segmentation model with 1,000× fewer labels, and a single set-prediction head 1/20th the size of the backbone, trained on frozen Panda outputs, achieves particle identification comparable to SOTA reconstruction tools.

Liquid argon time projection chambers (LArTPCs) provide dense, high-fidelity 3D measurements of particle interactions and underpin current and future neutrino and rare-event experiments. Physics reconstruction typically relies on complex detector-specific pipelines that use tens of hand-engineered pattern recognition algorithms or cascades of task-specific neural networks. These networks require extensive labeled simulation, which in turn requires a careful, time-consuming calibration process. We introduce Panda, a model that learns reusable sensor-level representations directly from raw unlabeled LArTPC data. Panda couples a hierarchical sparse 3D encoder with a multi-view, prototype-based self-distillation objective. On a simulated dataset, Panda substantially improves label efficiency and reconstruction quality, beating the previous state-of-the-art semantic segmentation model with 1,000× fewer labels. We also show that a single set-prediction head 1/20th the size of the backbone, with no physical priors and trained on frozen outputs from Panda, can achieve particle identification comparable to state-of-the-art (SOTA) reconstruction tools. Full fine-tuning further improves performance across all tasks.
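
To make the phrase "multi-view, prototype-based self-distillation objective" more concrete, the following is a minimal NumPy sketch of a generic DINO-style loss, not Panda's actual implementation. It assumes a student and a teacher that each embed a different augmented view of the same event, a shared bank of learnable prototypes, teacher centering, and an exponential-moving-average (EMA) teacher update. All function names, the temperatures, and the momentum value are illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def log_softmax(logits, temperature):
    """Numerically stable log-softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def prototype_scores(features, prototypes):
    """Cosine similarity between each feature vector and each prototype."""
    f = features / np.linalg.norm(features, axis=-1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=-1, keepdims=True)
    return f @ p.T


def self_distillation_loss(student_feats, teacher_feats, prototypes, center,
                           student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student prototype assignments.

    student_feats, teacher_feats: (N, D) embeddings of two views of the
    same N samples. prototypes: (K, D). center: (K,) running mean of the
    teacher scores, subtracted to discourage collapse (assumed here).
    """
    s_scores = prototype_scores(student_feats, prototypes)
    t_scores = prototype_scores(teacher_feats, prototypes)
    # The teacher target is sharpened (lower temperature) and centered.
    teacher_probs = np.exp(log_softmax(t_scores - center, teacher_temp))
    student_log_probs = log_softmax(s_scores, student_temp)
    return float(-(teacher_probs * student_log_probs).sum(axis=-1).mean())


def ema_update(teacher_params, student_params, momentum=0.996):
    """Move the teacher weights slowly toward the student weights."""
    return momentum * teacher_params + (1.0 - momentum) * student_params


# Toy usage with random arrays standing in for encoder outputs.
rng = np.random.default_rng(0)
student_view = rng.normal(size=(8, 16))   # 8 samples, 16-dim features
teacher_view = rng.normal(size=(8, 16))
prototypes = rng.normal(size=(32, 16))    # 32 hypothetical prototypes
center = np.zeros(32)
loss = self_distillation_loss(student_view, teacher_view, prototypes, center)
```

In a real training loop only the student and the prototypes would receive gradients, while the teacher would be refreshed with `ema_update` after each step. The sparse 3D encoder that would produce the features is outside the scope of this sketch.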
