LG | Nov 26, 2025

Visualizing LLM Latent Space Geometry Through Dimensionality Reduction

Alex Ning, Vainateya Rangaraju, Yen-Ling Kuo

arXiv:2511.21594v2 | 3 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses interpretability challenges for researchers and practitioners using LLMs, though it appears incremental as it applies existing visualization methods to new model contexts.

The researchers address the problem of interpreting LLM internals by extracting latent states from Transformer-based models and visualizing their geometry with dimensionality reduction techniques such as PCA and UMAP. They report geometric patterns including a separation between attention and MLP component outputs across intermediate layers.

Large language models (LLMs) achieve state-of-the-art results across many natural language tasks, but their internal mechanisms remain difficult to interpret. In this work, we extract, process, and visualize latent state geometries in Transformer-based language models through dimensionality reduction. We capture layerwise activations at multiple points within Transformer blocks and enable systematic analysis through Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP). We demonstrate experiments on GPT-2 and LLaMa models, where we uncover interesting geometric patterns in latent space. Notably, we identify a clear separation between attention and MLP component outputs across intermediate layers, a pattern not documented in prior work to our knowledge. We also characterize the high norm of latent states at the initial sequence position and visualize the layerwise evolution of latent states. Additionally, we demonstrate the high-dimensional helical structure of GPT-2's positional embeddings and the sequence-wise geometric patterns in LLaMa. We make our code available at https://github.com/Vainateya/Feature_Geometry_Visualization.
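The PCA step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It uses synthetic arrays as stand-ins for captured activations, and the names `pca_project`, `attn_out`, and `mlp_out` are hypothetical. The two groups are deliberately offset so that the projection shows a separation; nothing here reproduces the paper's measurements.

```python
import numpy as np


def pca_project(states, n_components=2):
    """Project rows of `states` (n_samples, d_model) onto the top principal components."""
    # PCA requires mean-centered data.
    centered = states - states.mean(axis=0, keepdims=True)
    # Rows of `vt` are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Fraction of total variance explained by each kept component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ components.T, explained[:n_components]


# Synthetic stand-ins for activations captured at two points in a block
# (assumption: real activations would come from hooks on an actual model).
rng = np.random.default_rng(0)
n_tokens, d_model = 200, 64
offset = np.zeros(d_model)
offset[0] = 10.0  # artificial shift so the two groups are separable

attn_out = rng.normal(size=(n_tokens, d_model)) + offset
mlp_out = rng.normal(size=(n_tokens, d_model)) - offset

states = np.concatenate([attn_out, mlp_out])
labels = np.array([0] * n_tokens + [1] * n_tokens)  # 0 = attention, 1 = MLP

coords, explained = pca_project(states, n_components=2)
```

Plotting `coords` colored by `labels` would then show whether the two groups occupy distinct regions of the projected space. UMAP could be substituted for the projection step to emphasize local rather than global structure.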

