CL | Feb 14, 2023

A Psycholinguistic Analysis of BERT's Representations of Compounds

arXiv:2302.07232v1270 citations | h-index: 9
AI Analysis

This work offers insight into how BERT handles fine-grained semantics and into how speakers represent compounds, but it is incremental: it builds on existing word-level studies.

The study investigated whether BERT's semantic representations for compounds (e.g., sunlight) align with human intuitions, using two psycholinguistic measures: lexeme meaning dominance and semantic transparency. It found moderate alignment, with higher, more contextualized layers performing best.

This work studies the semantic representations learned by BERT for compounds, that is, expressions such as sunlight or bodyguard. We build on recent studies that explore semantic information in Transformers at the word level and test whether BERT aligns with human semantic intuitions when dealing with expressions (e.g., sunlight) whose overall meaning depends -- to varying degrees -- on the semantics of the constituent words (sun, light). We leverage a dataset that includes human judgments on two psycholinguistic measures of compound semantic analysis: lexeme meaning dominance (LMD; quantifying the weight of each constituent toward the compound meaning) and semantic transparency (ST; evaluating the extent to which the compound meaning is recoverable from the constituents' semantics). We show that BERT-based measures moderately align with human intuitions, especially when using contextualized representations, and that LMD is overall more predictable than ST. Contrary to the results reported for 'standard' words, higher, more contextualized layers are the best at representing compound meaning. These findings shed new light on the abilities of BERT in dealing with fine-grained semantic phenomena. Moreover, they can provide insights into how speakers represent compounds.
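To make the comparison between model-based measures and human judgments concrete, here is a minimal sketch of one way it could be set up. It is not the authors' implementation: the scoring formulas (`dominance_score`, `transparency_score`), the sign convention, and the toy vectors and ratings are all assumptions made for illustration. In practice the vectors would come from a chosen BERT layer, and the ratings from the human-judgment dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dominance_score(compound, modifier, head):
    """Hypothetical LMD-style score (assumed convention):
    positive when the compound vector is closer to the head than to the modifier."""
    return cosine(compound, head) - cosine(compound, modifier)


def transparency_score(compound, modifier, head):
    """Hypothetical ST-style score: mean similarity of the compound to its constituents."""
    return (cosine(compound, modifier) + cosine(compound, head)) / 2


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy placeholders only: (compound, modifier, head) vectors and invented ratings.
# None of these values come from BERT or from the human-judgment dataset.
TOY_ITEMS = [
    ([0.9, 0.1, 0.3], [1.0, 0.0, 0.0], [0.2, 0.1, 1.0]),
    ([0.2, 0.8, 0.6], [1.0, 0.1, 0.0], [0.1, 0.9, 0.7]),
    ([0.5, 0.5, 0.5], [0.9, 0.2, 0.1], [0.3, 0.6, 0.8]),
    ([0.1, 0.2, 0.9], [0.8, 0.3, 0.2], [0.0, 0.1, 1.0]),
]
TOY_HUMAN_LMD = [2.0, 6.5, 4.0, 7.0]

model_lmd = [dominance_score(c, m, h) for c, m, h in TOY_ITEMS]
print("Spearman (toy LMD):", spearman(model_lmd, TOY_HUMAN_LMD))
```

A rank correlation is used here because model similarity scores and human rating scales are not on comparable units. Repeating the same computation with vectors from each layer would give a layer-by-layer alignment profile of the kind the abstract describes.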
