Shades of meaning: Uncovering the geometry of ambiguous word representations through contextualised language models
The work provides quantitative support for psychological theories of lexical ambiguity, though the contribution is incremental: it applies existing models to a known problem.
The study tackled the challenge of lexical ambiguity by analyzing contextual language models, finding that their representations distinguish between unambiguous, homonymous, and polysemous words in ways that align with lexicographic and psychological theories.
Lexical ambiguity presents a profound and enduring challenge to the language sciences. For decades, researchers have grappled with the problem of how language users learn, represent and process words with more than one meaning. Our work offers new insight into the psychological understanding of lexical ambiguity through a series of simulations that capitalise on recent advances in contextual language models. These models have no grounded understanding of the meanings of words at all; they simply learn to predict words based on the surrounding context provided by other words. Yet our analyses show that their representations capture fine-grained, meaningful distinctions between unambiguous, homonymous, and polysemous words that align with lexicographic classifications and psychological theorising. These findings provide quantitative support for modern psychological conceptualisations of lexical ambiguity and raise new challenges for understanding how contextual information shapes the meanings of words across different timescales.