LG · ML · May 8

When Symbol Names Should Not Matter: A Logistic Theory of Fresh-Symbol Classification

arXiv:2605.07120 · 14.9

Predicted impact: top 87% in LG · last 90 days
Originality: Incremental advance

AI Analysis

Provides a theoretical framework for understanding when transformers can achieve abstract symbol generalization in classification tasks, refining prior diversity conditions.

The paper studies fixed-label classification in template tasks, where models must learn decision rules invariant to symbol renaming. It analyzes regularized kernel logistic classification in the transformer-kernel regime, proving margin-transfer guarantees for fresh-symbol classification and showing that collision geometry, not just vocabulary size, determines generalization.

Template tasks have emerged as a clean testbed for asking whether transformers reason with abstract symbols rather than concrete token names. We study the fixed-label classification version of this problem, where train and test examples share latent templates but may use disjoint vocabularies. Unlike next-token prediction, the model need not emit unseen symbols; it must learn a decision rule invariant to symbol renaming. We analyze regularized kernel logistic classification in the transformer-kernel regime. Our main result decomposes the learned predictor into an ideal template-level classifier and a finite-sample perturbation caused by accidental token overlaps in the training data. We encode these overlaps by a colored collision graph and prove high-probability margin-transfer guarantees for fresh-symbol classification. This perspective extends template-based analyses to logistic classification and refines scalar diversity conditions: vocabulary size controls the average rate of collisions, but collision geometry controls whether the ideal classification margin is preserved. More broadly, the same perturbation framework applies to abstraction-augmented inputs, yielding a general margin-versus-collision criterion for identifying when prompting strategies improve fresh-symbol generalization. Synthetic template experiments illustrate the predicted roles of regularization, sample size, and transformer-kernel structure.
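The claim that vocabulary size controls the average rate of collisions can be illustrated with a small sketch. It is an illustration under stated assumptions, not the paper's construction: the abstract does not define the colored collision graph, so the code below builds a simplified, uncolored version in which two training examples are joined by an edge whenever they share at least one token. The template strings and the helper names `sample_example`, `collision_edges`, and `collision_rate` are hypothetical.

```python
import random
from itertools import combinations


def sample_example(template, vocab, rng):
    """Fill a template such as "ABA" with distinct symbols drawn from vocab.

    Each distinct letter in the template is a slot; equal letters receive
    the same symbol and different letters receive different symbols.
    """
    slots = sorted(set(template))
    symbols = rng.sample(vocab, len(slots))
    assignment = dict(zip(slots, symbols))
    return tuple(assignment[s] for s in template)


def collision_edges(examples):
    """Return index pairs (i, j), i < j, of examples sharing at least one token.

    Assumption: this is an uncolored simplification of a collision graph.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(examples), 2)
        if set(a) & set(b)
    ]


def collision_rate(vocab_size, n_examples, templates, seed=0):
    """Fraction of example pairs that accidentally share a token."""
    rng = random.Random(seed)
    vocab = list(range(vocab_size))
    examples = [
        sample_example(rng.choice(templates), vocab, rng)
        for _ in range(n_examples)
    ]
    n_pairs = n_examples * (n_examples - 1) // 2
    return len(collision_edges(examples)) / n_pairs


if __name__ == "__main__":
    templates = ["AA", "AB"]  # hypothetical "same" vs. "different" templates
    for size in (10, 100, 10_000):
        print(size, collision_rate(size, n_examples=50, templates=templates))
```

Under these assumptions the collision rate shrinks as the vocabulary grows. The sketch reports only that scalar rate; it does not capture the abstract's further point that the geometry of the collisions, rather than their average frequency, determines whether the ideal classification margin is preserved.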
