CL LG · Apr 11, 2019

Cross-topic distributional semantic representations via unsupervised mappings

Eleftheria Briakou, Nikos Athanasiou, Alexandros Potamianos

arXiv:1904.05674v131.01091 citations

Originality: Incremental advance

AI Analysis

This paper addresses word sense conflation in distributional semantic models for NLP applications and represents an incremental improvement over existing methods.

The paper tackles the problem of polysemy in Distributional Semantic Models by learning multiple word representations based on different topics and aligning them into a common space, achieving state-of-the-art results in contextual word similarity and outperforming single-prototype models in downstream NLP tasks.

In traditional Distributional Semantic Models (DSMs) the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a word based on different topics. First, a separate DSM is trained for each topic and then each of the topic-based DSMs is aligned to a common vector space. Our unsupervised mapping approach is motivated by the hypothesis that words preserving their relative distances in different topic semantic sub-spaces constitute robust semantic anchors that define the mappings between them. Aligned cross-topic representations achieve state-of-the-art results for the task of contextual word similarity. Furthermore, evaluation on NLP downstream tasks shows that multiple topic-based embeddings outperform single-prototype models.
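
The anchor-based alignment described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not say how anchors are scored or what form the mapping takes, so the sketch assumes (1) anchors are the words whose cosine-similarity profile over a shared vocabulary changes least between two topic spaces, and (2) the mapping is an orthogonal Procrustes rotation fitted on those anchors. The function names `select_anchors`, `procrustes_map`, and `align_topic_space` are hypothetical.

```python
import numpy as np

def select_anchors(X, Y, k):
    """Return indices of the k words whose similarity profile drifts least.

    X, Y: (n_words, dim) embeddings of the same vocabulary in two topic spaces.
    Assumption (not from the paper): drift is the L2 distance between a word's
    rows in the two cosine-similarity matrices. Rows must be non-zero vectors.
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    sim_x = Xn @ Xn.T  # pairwise cosine similarities in topic space X
    sim_y = Yn @ Yn.T  # pairwise cosine similarities in topic space Y
    drift = np.linalg.norm(sim_x - sim_y, axis=1)
    return np.argsort(drift)[:k]

def procrustes_map(X_anchor, Y_anchor):
    """Orthogonal matrix W minimizing ||X_anchor @ W - Y_anchor||_F."""
    U, _, Vt = np.linalg.svd(X_anchor.T @ Y_anchor)
    return U @ Vt

def align_topic_space(X, Y, k):
    """Map every vector of topic space X into space Y using k anchors."""
    anchors = select_anchors(X, Y, k)
    W = procrustes_map(X[anchors], Y[anchors])
    return X @ W, anchors
```

Under these assumptions, `align_topic_space(X, Y, k)` would be called once per topic-based DSM, with `Y` playing the role of the common space; words that shift meaning across topics are left out of the fit and simply carried along by the learned rotation.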
