CLJul 18, 2025
InTraVisTo: Inside Transformer Visualisation ToolNicolò Brunello, Davide Rigamonti, Andrea Sassella et al.
The reasoning capabilities of Large Language Models (LLMs) have increased greatly over the last few years, as have their size and complexity. Nonetheless, the use of LLMs in production remains challenging due to their unpredictable nature and discrepancies that can exist between their desired behavior and their actual model output. In this paper, we introduce a new tool, InTraVisTo (Inside Transformer Visualisation Tool), designed to enable researchers to investigate and trace the computational process that generates each token in a Transformer-based LLM. InTraVisTo provides a visualization of both the internal state of the Transformer model (by decoding token embeddings at each layer of the model) and the information flow between the various components across the different layers of the model (using a Sankey diagram). With InTraVisTo, we aim to help researchers and practitioners better understand the computations being performed within the Transformer model and thus to shed some light on internal patterns and reasoning processes employed by LLMs.
CLJan 19
A Shared Geometry of Difficulty in Multilingual Language ModelsStefano Civelli, Pietro Bernardelle, Nicolò Brunello et al.
Predicting problem-difficulty in large language models (LLMs) refers to estimating how difficult a task is according to the model itself, typically by training linear probes on its internal representations. In this work, we study the multilingual geometry of problem-difficulty in LLMs by training linear probes using the AMC subset of the Easy2Hard benchmark, translated into 21 languages. We found that difficulty-related signals emerge at two distinct stages of the model internals, corresponding to shallow (early-layers) and deep (later-layers) internal representations, that exhibit functionally different behaviors. Probes trained on deep representations achieve high accuracy when evaluated on the same language but exhibit poor cross-lingual generalization. In contrast, probes trained on shallow representations generalize substantially better across languages, despite achieving lower within-language performance. Together, these results suggest that LLMs first form a language-agnostic representation of problem difficulty, which subsequently becomes language-specific. This closely aligns with existing findings in LLM interpretability showing that models tend to operate in an abstract conceptual space before producing language-specific outputs. We demonstrate that this two-stage representational process extends beyond semantic content to high-level meta-cognitive properties such as problem-difficulty estimation.