LG AI | Jun 1, 2025

Bound by semanticity: universal laws governing the generalization-identification tradeoff

Marco Nurisso, Jesseba Fernando, Raj Deshpande, Alan Perotti, Raja Marjieh, Steven M. Frankland, Richard L. Lewis, Taylor W. Webb, Declan Campbell, Francesco Vaccarino, Jonathan D. Cohen, Giovanni Petri

arXiv:2506.14797v19.42 citations | h-index: 37

Originality: Incremental advance

AI Analysis

The paper provides an exact theory of the generalization-identification tradeoff and clarifies representational capacity limits for deep networks and brains, though its extension of the analysis to complex models is an incremental step.

The paper tackles the fundamental tradeoff between generalization and identification in intelligent systems. It derives universal laws that predict a sharp collapse in multi-input processing capacity and a non-monotonic optimum for generalization probability. Empirical results in deep networks align closely with the theoretical curves.

Intelligent systems must deploy internal representations that are simultaneously structured -- to support broad generalization -- and selective -- to preserve input identity. We expose a fundamental limit on this tradeoff. For any model whose representational similarity between inputs decays with finite semantic resolution $\varepsilon$, we derive closed-form expressions that pin its probability of correct generalization $p_S$ and identification $p_I$ to a universal Pareto front independent of input space geometry. Extending the analysis to noisy, heterogeneous spaces and to $n>2$ inputs predicts a sharp $1/n$ collapse of multi-input processing capacity and a non-monotonic optimum for $p_S$. A minimal ReLU network trained end-to-end reproduces these laws: during learning a resolution boundary self-organizes and empirical $(p_S,p_I)$ trajectories closely follow theoretical curves for linearly decaying similarity. Finally, we demonstrate that the same limits persist in two markedly more complex settings -- a convolutional neural network and state-of-the-art vision-language models -- confirming that finite-resolution similarity is a fundamental emergent informational constraint, not merely a toy-model artifact. Together, these results provide an exact theory of the generalization-identification trade-off and clarify how semantic resolution shapes the representational capacity of deep networks and brains alike.
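As a rough illustration of what "linearly decaying similarity" with a finite resolution $\varepsilon$ could look like, the sketch below uses a one-dimensional input space and a kernel that falls linearly to zero at distance $\varepsilon$. Everything here is an assumption made for illustration: the function names, the uniform inputs on $[0, 1]$, and the two proxy quantities are hypothetical and are not the paper's definitions of $p_S$ and $p_I$ or its closed-form laws.

```python
import random


def linear_similarity(x, y, eps):
    """Similarity that decays linearly with distance and vanishes beyond eps.

    Assumed form for illustration only; the paper's exact kernel may differ.
    """
    return max(0.0, 1.0 - abs(x - y) / eps)


def toy_rates(eps, n_pairs=20000, seed=0):
    """Estimate two toy proxies for inputs drawn uniformly from [0, 1].

    overlap:  mean similarity between two independent random inputs,
              a stand-in for how much representational structure is shared.
    distinct: fraction of pairs whose similarity is exactly zero,
              a stand-in for how often two inputs stay fully separable.

    These are illustrative proxies, NOT the paper's p_S and p_I.
    """
    rng = random.Random(seed)
    total_sim = 0.0
    zero_count = 0
    for _ in range(n_pairs):
        x, y = rng.random(), rng.random()
        s = linear_similarity(x, y, eps)
        total_sim += s
        if s == 0.0:
            zero_count += 1
    return total_sim / n_pairs, zero_count / n_pairs


if __name__ == "__main__":
    # Sweep the resolution: larger eps raises overlap and lowers separability.
    for eps in (0.05, 0.2, 0.5, 1.0):
        overlap, distinct = toy_rates(eps)
        print(f"eps={eps:.2f}  overlap={overlap:.3f}  distinct={distinct:.3f}")
```

In this toy setting, widening $\varepsilon$ increases the shared-structure proxy while shrinking the separability proxy, which is the qualitative tension the abstract describes. The quantitative Pareto front and the $1/n$ collapse come from the paper's own derivation and are not reproduced here.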
