LG | AI | HEP-EX | Aug 13, 2025

FM4NPP: A Scaling Foundation Model for Nuclear and Particle Physics

David Park, Shuhang Li, Yi Huang, Xihaier Luo, Haiwang Yu, Yeonju Go, Christopher Pinkenburg, Yuewei Lin, Shinjae Yoo, Joseph Osborn, Jin Huang, Yihui Ren

arXiv:2508.14087v1 | 11.43 citations | h-index: 54

Originality: Incremental advance

AI Analysis

This work addresses the problem of scaling and generalizing foundation models for particle physics. The advance is incremental: it adapts existing FM paradigms to a specific scientific domain with sparse detector data.

The paper tackles the challenge of applying foundation models to experimental particle physics. It introduces a new dataset of more than 11 million particle collision events and a novel self-supervised training method, and shows that the approach scales to models of up to 188 million parameters. With frozen weights and task-specific adapters, the model consistently outperforms baselines across diverse downstream tasks.

Large language models have revolutionized artificial intelligence by enabling large, generalizable models trained through self-supervision. This paradigm has inspired the development of scientific foundation models (FMs). However, applying this capability to experimental particle physics is challenging due to the sparse, spatially distributed nature of detector data, which differs dramatically from natural language. This work asks whether an FM for particle physics can scale and generalize across diverse tasks. We introduce a new dataset with more than 11 million particle collision events and a suite of downstream tasks and labeled data for evaluation. We propose a novel self-supervised training method for detector data and demonstrate its neural scalability with models that feature up to 188 million parameters. With frozen weights and task-specific adapters, this FM consistently outperforms baseline models across all downstream tasks. The performance also exhibits robust data-efficient adaptation. Further analysis reveals that the representations extracted by the FM are task-agnostic but can be specialized via a single linear mapping for different downstream tasks.
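
The "frozen weights plus task-specific adapter" setup described in the abstract can be illustrated with a toy sketch. This is not the authors' model or code: the backbone here is a hypothetical stand-in (a fixed random projection), the data are synthetic, and the names `frozen_backbone`, `fit_linear_adapter`, and `apply_linear_adapter` are made up for illustration. It only shows the general pattern of keeping the feature extractor fixed and fitting one linear map per downstream task.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained backbone: a fixed random
# projection followed by tanh. These weights are never updated.
W_frozen = rng.normal(size=(8, 32))
W_frozen_copy = W_frozen.copy()  # kept only to check that nothing changed


def frozen_backbone(x):
    """Map raw inputs to task-agnostic features using fixed weights."""
    return np.tanh(x @ W_frozen)


def _with_bias(features):
    """Append a constant column so the linear map includes a bias term."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_linear_adapter(features, targets):
    """Fit a task-specific linear map on frozen features by least squares."""
    weights, *_ = np.linalg.lstsq(_with_bias(features), targets, rcond=None)
    return weights


def apply_linear_adapter(features, weights):
    """Apply a fitted linear adapter to frozen features."""
    return _with_bias(features) @ weights


# Synthetic inputs and two toy "downstream tasks" whose targets are,
# by construction, linear functions of the frozen features.
x = rng.normal(size=(200, 8))
features = frozen_backbone(x)
task_a_targets = features @ rng.normal(size=(32, 1))
task_b_targets = features @ rng.normal(size=(32, 3))

# One adapter per task; the backbone is shared and untouched.
adapter_a = fit_linear_adapter(features, task_a_targets)
adapter_b = fit_linear_adapter(features, task_b_targets)

pred_a = apply_linear_adapter(features, adapter_a)
pred_b = apply_linear_adapter(features, adapter_b)
```

In this toy setting the same feature matrix serves both tasks, and only the small per-task weight matrices differ. The adapters used in the paper may be more elaborate than a single least-squares fit.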
