CL | Apr 1, 2022

Sense disambiguation of compound constituents

arXiv:2204.00429v1 | h-index: 24
Originality: Incremental advance
AI Analysis

The paper addresses a specific issue in computational linguistics, namely improving the semantic analysis of compounds, but the advance is incremental because it modifies an existing method.

The paper tackles word sense disambiguation for the constituents of noun-noun compounds, such as 'star' in 'starfish', by adapting a set expansion method originally designed for analogies. It reports successful results on a data set of nearly 9000 compounds, though performance varies with compound frequency.

In distributional semantic accounts of the meaning of noun-noun compounds (e.g. starfish, bank account, houseboat), the important role of constituent polysemy remains largely unaddressed (cf. the meaning of star in starfish vs. star cluster vs. star athlete). Instead of semantic vectors that average over the different meanings of a constituent, disambiguated vectors of the constituents would be needed in order to see what these more specific constituent meanings contribute to the meaning of the compound as a whole. This paper presents a novel approach to this specific problem of word sense disambiguation: set expansion. We build on the approach developed by Mahabal et al. (2018), which was originally designed to solve the analogy problem. We modified their method so that it can address the problem of sense disambiguation of compound constituents. The results of experiments with a data set of almost 9000 compounds (LADEC, Gagné et al. 2019) suggest that this approach is successful, yet the success is sensitive to the frequency with which the compounds are attested.
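The intuition behind using set expansion for disambiguation can be illustrated with a toy sketch. This is not the paper's method or the implementation of Mahabal et al. (2018); the feature sets, the `FEATURES` table, and the functions `shared_features` and `expand` are all hypothetical and hand-made for illustration. The sketch assumes each word has a set of sparse context features, and that the features shared by both constituents pick out the relevant sense of a polysemous constituent such as star.

```python
# Toy sparse context features per word (hypothetical, hand-made for illustration).
FEATURES = {
    "star": {"sky", "galaxy", "bright", "famous", "fame", "five_pointed"},
    "cluster": {"sky", "galaxy", "group"},
    "athlete": {"famous", "fame", "sport"},
    "planet": {"sky", "galaxy", "orbit"},
    "actor": {"famous", "fame", "film"},
    "boat": {"water", "sail"},
}


def shared_features(seeds):
    """Return the features common to every seed word.

    For a compound, these shared features act as a crude stand-in for
    the sense of the polysemous constituent that the compound selects.
    """
    return set.intersection(*(FEATURES[word] for word in seeds))


def expand(seeds, top_k=1):
    """Expand the seed set with the words that best match the shared features."""
    focus = shared_features(seeds)
    scores = {
        word: len(features & focus)
        for word, features in FEATURES.items()
        if word not in seeds
    }
    # Rank by overlap (descending), breaking ties alphabetically.
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return [word for word in ranked if scores[word] > 0][:top_k]


if __name__ == "__main__":
    print(shared_features(["star", "cluster"]))  # astronomy-related features
    print(expand(["star", "cluster"]))           # neighbours of that sense
    print(expand(["star", "athlete"]))           # neighbours of the fame sense
```

In this toy setting, pairing star with cluster narrows its features to the astronomical ones, while pairing it with athlete narrows them to the fame-related ones. A realistic system would derive the features from corpus statistics rather than from a hand-written table.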

Code Implementations: 1 repo
