CL AI May 18, 2018

Robust Handling of Polysemy via Sparse Representations

arXiv:1805.07398v132.21114 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of handling words with multiple meanings in natural language processing, though it appears incremental because it builds on existing representation methods.

The paper tackles polysemy by proposing sparse distributed representations, implemented in a system called Category Builder. The authors argue these are more expressive than dense representations such as Word2Vec, and report that the system outperforms them in some analogy classes despite being given only two of the three input terms.

Words are polysemous and multi-faceted, with many shades of meanings. We suggest that sparse distributed representations are more suitable than other, commonly used, (dense) representations to express these multiple facets, and present Category Builder, a working system that, as we show, makes use of sparse representations to support multi-faceted lexical representations. We argue that the set expansion task is well suited to study these meaning distinctions since a word may belong to multiple sets with a different reason for membership in each. We therefore exhibit the performance of Category Builder on this task, while showing that our representation captures at the same time analogy problems such as "the Ganga of Egypt" or "the Voldemort of Tolkien". Category Builder is shown to be a more expressive lexical representation and to outperform dense representations such as Word2Vec in some analogy classes despite being shown only two of the three input terms.
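To make the idea concrete, here is a minimal sketch of set expansion and a two-term analogy over sparse feature maps. It is not Category Builder's actual algorithm: the toy vocabulary, the feature names and weights, and the `expand` and `analogy` helpers are all assumptions invented for illustration. It only shows how a word such as "amazon" can join different sets for different reasons when each facet is a separate sparse feature.

```python
# Hypothetical toy data: each word maps to a sparse dict of context features.
# Words, features, and weights are invented for illustration only.
SPARSE = {
    "ganga": {"river": 1.0, "india": 1.0, "sacred": 0.5},
    "nile": {"river": 1.0, "egypt": 1.0},
    "amazon": {"river": 1.0, "brazil": 1.0, "company": 1.0},
    "cairo": {"city": 1.0, "egypt": 1.0},
    "delhi": {"city": 1.0, "india": 1.0},
    "google": {"company": 1.0, "search": 1.0},
    "microsoft": {"company": 1.0, "software": 1.0},
}


def expand(seeds, vocab=SPARSE, top_k=3):
    """Rank non-seed words by their weight on features shared by all seeds."""
    shared = set.intersection(*(set(vocab[s]) for s in seeds))
    scores = {}
    for word, feats in vocab.items():
        if word in seeds:
            continue
        score = sum(weight for f, weight in feats.items() if f in shared)
        if score > 0:
            scores[word] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


def analogy(source, context, vocab=SPARSE):
    """Answer 'the <source> of <context>' from two terms only (assumed scheme).

    Candidates must carry the context feature; among them, pick the one
    whose remaining features overlap most with the source word.
    """
    src = vocab[source]
    best, best_score = None, 0.0
    for word in sorted(vocab):
        feats = vocab[word]
        if word == source or context not in feats:
            continue
        score = sum(min(w, src[f]) for f, w in feats.items() if f in src)
        if score > best_score:
            best, best_score = word, score
    return best


river_set = expand(["ganga", "nile"])        # the seeds share the "river" facet
company_set = expand(["amazon", "google"])   # the seeds share the "company" facet
ganga_of_egypt = analogy("ganga", "egypt")
```

In this toy setup, "amazon" is pulled into the river set by one feature and seeds the company set through another, which is the kind of multi-faceted membership the abstract describes.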
