Spelling-out is not Straightforward: LLMs' Capability of Tokenization from Token to Characters
This work addresses a fundamental limitation in LLMs' tokenization capabilities. The contribution is incremental, as it builds on existing understanding of model internals.
The study investigates how large language models (LLMs) handle character-level information when spelling out tokens. It finds that they struggle with more complex tasks, such as identifying compositional subcomponents within tokens, and that they rely on intermediate and higher Transformer layers, rather than the embedding layer, to reconstruct character-level information.
Large language models (LLMs) can spell out tokens character by character with high accuracy, yet they struggle with more complex character-level tasks, such as identifying compositional subcomponents within tokens. In this work, we investigate how LLMs internally represent and utilize character-level information during the spelling-out process. Our analysis reveals that, although spelling out is a simple task for humans, it is not handled in a straightforward manner by LLMs. Specifically, we show that the embedding layer does not fully encode character-level information, particularly beyond the first character. As a result, LLMs rely on intermediate and higher Transformer layers to reconstruct character-level knowledge, where we observe a distinct "breakthrough" in their spelling behavior. We validate this mechanism through three complementary analyses: probing classifiers, identification of knowledge neurons, and inspection of attention weights.
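To make the first of these analyses concrete, the following is a minimal sketch of a linear probing classifier. It is not the authors' implementation. The function names (`train_linear_probe`, `probe_accuracy`, `toy_probe_accuracy`), the hyperparameters, and the synthetic vectors are all assumptions for illustration: the vectors stand in for per-token representations, and the labels stand in for "the character at a given position". The sketch shows only the general logic of probing: a linear classifier recovers a label well when the representation encodes it linearly, and stays near chance when it does not.

```python
import numpy as np


def train_linear_probe(X, y, num_classes, lr=0.1, epochs=200):
    """Fit a softmax-regression probe with full-batch gradient descent."""
    n, d = X.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    Y = np.eye(num_classes)[y]  # one-hot targets, shape (n, num_classes)
    for _ in range(epochs):
        logits = X @ W + b
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P = P / P.sum(axis=1, keepdims=True)  # softmax probabilities
        grad = (P - Y) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def probe_accuracy(W, b, X, y):
    """Fraction of examples whose label the probe predicts correctly."""
    return float(((X @ W + b).argmax(axis=1) == y).mean())


def toy_probe_accuracy(signal, seed=0, n=2000, d=32, num_classes=4):
    """Train and evaluate a probe on SYNTHETIC data (an assumption, not model data).

    Each vector is Gaussian noise plus `signal` times a class-specific
    direction. With signal=0 the label is not encoded in the vectors at all.
    """
    rng = np.random.default_rng(seed)
    y = rng.integers(0, num_classes, size=n)
    directions = rng.normal(size=(num_classes, d))
    X = rng.normal(size=(n, d)) + signal * directions[y]
    split = int(0.75 * n)  # hold out the last 25% for evaluation
    W, b = train_linear_probe(X[:split], y[:split], num_classes)
    return probe_accuracy(W, b, X[split:], y[split:])


acc_encoded = toy_probe_accuracy(signal=1.0)  # label is linearly encoded
acc_control = toy_probe_accuracy(signal=0.0)  # label is absent (control)
print(f"probe accuracy, label encoded: {acc_encoded:.2f}")
print(f"probe accuracy, control:       {acc_control:.2f}")
```

To use a probe of this kind on a real model, one would replace the synthetic `X` with hidden states taken from a chosen layer and `y` with the index of the character at a fixed position in each token, then compare held-out accuracy across layers and character positions. The exact probe architecture and data used in the paper are not specified in this abstract.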