Characterizing the Role of Similarity in the Property Inferences of Language Models
This work provides insight into the conceptual structure of language models and could inform psycholinguistic experiments, but it is incremental: it applies existing methods to analyze a known phenomenon.
The study investigates how language models perform property inheritance and finds that they are more likely to project novel properties between categories that are both taxonomically related and highly similar, rather than relying on either mechanism alone.
Property inheritance -- a phenomenon where novel properties are projected from higher-level categories (e.g., birds) to lower-level ones (e.g., sparrows) -- provides a unique window into how humans organize and deploy conceptual knowledge. It is debated whether this ability arises from explicitly stored taxonomic knowledge or from simple computations of similarity between mental representations. How are these mechanistic hypotheses manifested in contemporary language models? In this work, we investigate how LMs perform property inheritance using behavioral and causal representational analysis experiments. We find that taxonomy and categorical similarity are not mutually exclusive in LMs' property inheritance behavior. That is, LMs are more likely to project novel properties from one category to another when the two are taxonomically related and, at the same time, highly similar. Our findings provide insight into the conceptual structure of language models and may suggest new psycholinguistic experiments for human subjects.