Synthetic Categorical Restructuring, or How AIs Gradually Extract Efficient Regularities from Their Experience of the World
This work addresses a fundamental problem in AI interpretability that is relevant to researchers, but its contribution is incremental because it focuses on a specific visualization tool.
The study investigates how language models segment and restructure internal categories to improve efficiency, visualizing the process in GPT2-XL's early layers.
How do language models segment their internal experience of the world of words to progressively learn to interact with it more efficiently? This study in the neuropsychology of artificial intelligence investigates the phenomenon of synthetic categorical restructuring, a process through which each successive perceptron neural layer abstracts and combines relevant categorical sub-dimensions from the thought categories of its previous layer. This process shapes new, even more efficient categories for analyzing and processing the synthetic system's own experience of the linguistic external world to which it is exposed. Our genetic neuron viewer, associated with this study, allows visualization of the synthetic categorical restructuring phenomenon occurring during the transition from perceptron layer 0 to 1 in GPT2-XL.
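To make the idea of one layer combining sub-dimensions from its predecessor more concrete, here is a minimal toy sketch. It is not the study's method and does not use GPT2-XL weights: it assumes two small, directly connected dense layers with hand-made values, and it uses ReLU where GPT-2 uses GELU. In the real model, attention and the residual stream sit between successive perceptron blocks, so the direct connection is a simplification. All names (`layer0_acts`, `weights_0_to_1`, `top_precursors`) are hypothetical.

```python
# Toy illustration only: hand-made values, not GPT2-XL activations or weights.
# Assumption: layer 0 feeds layer 1 directly through one weight matrix.

def relu(x):
    """Simplified activation (GPT-2 itself uses GELU)."""
    return x if x > 0.0 else 0.0

def layer1_activation(layer0_acts, weights_0_to_1, target):
    """Activation of one layer-1 neuron given the layer-0 activations."""
    total = sum(a * row[target] for a, row in zip(layer0_acts, weights_0_to_1))
    return relu(total)

def top_precursors(layer0_acts, weights_0_to_1, target, k=2):
    """Rank layer-0 neurons by contribution (activation * weight) to a layer-1 neuron."""
    contributions = [
        (i, a * row[target])
        for i, (a, row) in enumerate(zip(layer0_acts, weights_0_to_1))
    ]
    # Largest positive contribution first
    contributions.sort(key=lambda pair: pair[1], reverse=True)
    return contributions[:k]

# Three layer-0 neurons feeding two layer-1 neurons (rows: layer 0, columns: layer 1)
layer0_acts = [1.0, 0.5, 2.0]
weights_0_to_1 = [
    [0.8, -0.2],
    [0.1, 0.9],
    [-0.5, 0.4],
]

# Indices of the two layer-0 neurons that contribute most to layer-1 neuron 1
strongest = [i for i, _ in top_precursors(layer0_acts, weights_0_to_1, target=1)]
print(strongest)
```

In this toy setting, the layer-1 neuron draws mainly on a subset of layer-0 neurons, which loosely mirrors the abstract's description of a new category being assembled from selected sub-dimensions of earlier ones.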