Enhancing next token prediction based pre-training for jet foundation models

arXiv:2512.04149v1 | 3 citations | h-index: 15
Originality: Incremental advance
AI Analysis

This work makes incremental improvements to the pre-training of jet foundation models and is aimed at researchers in high-energy physics.

The paper improves next token prediction pre-training for jet foundation models in two ways. It introduces a hybrid input setup, in which the model reads continuous feature vectors but still predicts token-IDs, and it combines masked particle modeling with generative objectives. Together these changes improve downstream classification performance without compromising generative capabilities.

Next token prediction is an attractive pre-training task for jet foundation models, in that it is simulation free and enables excellent generative capabilities that can transfer across datasets. Here we study multiple improvements to next token prediction, building on the initial work of OmniJet-$α$. Instead of tokenizing particles and subsequently only using the token-ID as the model input for both the generative and the classification task, we adopt a hybrid setup, which allows us to use continuous feature vectors as model input while only using token-IDs in the next token prediction target. Secondly, we explore a combined pre-training strategy that combines masked particle modeling and generative learning objectives. Taken together, these changes greatly improve the performance in downstream classification tasks without any loss in generative performance.
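The hybrid setup and the masked objective described in the abstract can be illustrated with a small data-preparation sketch. This is not the authors' implementation: the nearest-neighbour codebook lookup stands in for the learned tokenizer, and the names `nearest_code`, `hybrid_ntp_example`, `masked_example`, `START`, and `MASK` are hypothetical, chosen only for illustration. The point is simply that the model input keeps the continuous particle features, while the prediction target is a discrete token-ID.

```python
# Toy sketch, assuming a fixed codebook in place of a learned tokenizer.
START, MASK = "<start>", "<mask>"  # hypothetical placeholder symbols


def nearest_code(features, codebook):
    """Return the index of the codebook entry closest to `features`."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((f - c) ** 2 for f, c in zip(features, codebook[i])),
    )


def hybrid_ntp_example(jet, codebook):
    """Next token prediction: continuous inputs, token-ID targets.

    Position i sees the particles up to i and must predict the token-ID
    of particle i + 1; the final target is a stop ID (assumed here to be
    one past the last codebook index).
    """
    token_ids = [nearest_code(p, codebook) for p in jet]
    stop_id = len(codebook)
    inputs = [START] + list(jet)      # continuous feature vectors
    targets = token_ids + [stop_id]   # discrete token-IDs
    return inputs, targets


def masked_example(jet, codebook, masked_positions):
    """Masked particle modeling: hide some particles, predict their token-IDs."""
    inputs = [MASK if i in masked_positions else p for i, p in enumerate(jet)]
    targets = {i: nearest_code(jet[i], codebook) for i in sorted(masked_positions)}
    return inputs, targets


# Three particles with two made-up features each, and a three-entry codebook.
jet = [(1.0, 0.0), (0.4, 0.1), (0.1, -0.2)]
codebook = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]

ntp_inputs, ntp_targets = hybrid_ntp_example(jet, codebook)
mpm_inputs, mpm_targets = masked_example(jet, codebook, {1})

print(ntp_targets)  # [2, 1, 0, 3]
print(mpm_targets)  # {1: 1}
```

In a combined pre-training run, both kinds of examples would feed the same backbone, with one loss term per objective. How the two terms are weighted or scheduled is not stated in the abstract, so the sketch leaves that part out.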
