Word Meanings in Transformer Language Models
This work addresses a fundamental question in NLP: how transformer models represent word meanings. It has implications for interpretability and cognitive modeling, although it is incremental in that it builds on existing embedding analysis methods.
The study investigates whether transformer language models such as RoBERTa-base encode semantic information in a manner analogous to a lexical store. It does so by clustering the token embeddings and testing the clusters for sensitivity to semantic information and to psycholinguistic measures. The findings indicate that a wide variety of semantic information is encoded in the token embedding space, which rules out certain meaning eliminativist hypotheses.
We investigate how word meanings are represented in transformer language models. Specifically, we focus on whether transformer models employ something analogous to a lexical store, in which each word has an entry that contains semantic information. To do this, we extracted the token embedding space of RoBERTa-base and clustered it with k-means into 200 clusters. In our first study, we manually inspected the resulting clusters to assess whether they are sensitive to semantic information. In our second study, we tested whether the clusters are sensitive to five psycholinguistic measures: valence, concreteness, iconicity, taboo, and age of acquisition. Overall, our findings were very positive: a wide variety of semantic information is encoded within the token embedding space. This serves to rule out certain "meaning eliminativist" hypotheses about how transformer LLMs process semantic information.
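The two-step procedure described above (cluster the embeddings, then ask whether cluster membership tracks a psycholinguistic measure) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the vectors and scores are invented toy values, not RoBERTa embeddings or published norms; the farthest-first seeding is chosen only to make the example deterministic; and the eta-squared statistic is one plausible way to quantify "sensitivity", not necessarily the test used in the study. The function names `kmeans`, `farthest_first_centroids`, and `eta_squared` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centroids(vectors, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first vector, then repeatedly add the vector farthest from all
    centroids chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(vectors, k, max_iterations=100):
    """Lloyd's algorithm; returns one cluster label per input vector."""
    centroids = farthest_first_centroids(vectors, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def eta_squared(scores, labels):
    """Share of the variance in `scores` accounted for by cluster membership
    (between-cluster sum of squares divided by total sum of squares)."""
    grand_mean = sum(scores) / len(scores)
    total = sum((s - grand_mean) ** 2 for s in scores)
    groups = {}
    for score, label in zip(scores, labels):
        groups.setdefault(label, []).append(score)
    between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between / total if total else 0.0


# Invented toy data: eight 2-dimensional "embeddings" and a made-up score
# per item standing in for a norm such as valence.
toy_embeddings = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0],
]
toy_scores = [1.0, 2.0, 1.0, 2.0, 8.0, 9.0, 8.0, 9.0]

labels = kmeans(toy_embeddings, k=2)
print(labels)                                   # [0, 0, 0, 0, 1, 1, 1, 1]
print(round(eta_squared(toy_scores, labels), 3))  # 0.98
```

At the scale described in the abstract (the full RoBERTa-base vocabulary and 200 clusters), a pure-Python loop like this would be slow. One would more likely read the embedding matrix from a library such as Hugging Face Transformers (for example via `get_input_embeddings().weight` on the loaded model) and cluster it with an optimized routine such as scikit-learn's `KMeans`; those choices are likewise assumptions, since the abstract does not name the tooling.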