LG AINov 17, 2025

Transformer Injectivity & Geometric Robustness - Analytic Margins and Bi-Lipschitz Uniformity of Sequence-Level Hidden States

arXiv:2511.14808v1

Originality Incremental advance

AI Analysis

This work addresses the theoretical and practical robustness of Transformer representations for AI researchers, providing foundational insights into model invertibility and geometric stability.

The paper tackles the problem of understanding when Transformer models produce unique hidden states for different prompts, proving that under analytic assumptions, they are generically injective and remain so during training, and empirically showing no collisions in full precision or 8-bit quantization but some in 4-bit quantization with reduced co-Lipschitz estimates.

Under real-analytic assumptions on decoder-only Transformers, recent work shows that the map from discrete prompts to last-token hidden states is generically injective on finite prompt sets. We refine this picture: for each layer $\ell$ we define a collision discriminant $Δ^\ell \subset Θ$ and injective stratum $U^\ell = Θ\setminus Δ^\ell$, and prove a dichotomy -- either the model is nowhere injective on the set, or $U^\ell$ is open and dense and every $F^\ell_θ$ is injective. Under mild non-singularity assumptions on the optimizer and an absolutely continuous initialization, generic injectivity persists along smooth training trajectories over any fixed horizon. We also treat symmetry groups $G$, showing that discriminants and injective strata descend to the quotient $Θ/G$, so injectivity is naturally a property of functional equivalence classes. We complement these results with an empirical study of layerwise geometric diagnostics. We define a separation margin and a co-Lipschitz (lower Lipschitz) constant between prompt space and last-token representation space, estimated via nearest-neighbor statistics on large prompt sets. Applying these diagnostics to pretrained LLaMA-3 and Qwen models, we study behavior across layers, sequence lengths, model scales, and 8- and 4-bit activation quantization. On our sampled prompts we see no collisions in full precision or at 8 bits, while 4-bit quantization induces a small number of collisions and markedly shrinks co-Lipschitz estimates. For a small GPT-2 trained from scratch, normalized metrics remain stable over training. Overall, the results suggest that Transformer representations are generically and persistently injective in the continuous-parameter idealization, while their practical invertibility can be probed using simple geometric diagnostics.

View on arXiv PDF

Similar