IR, CL | Sep 4, 2019

Affect Enriched Word Embeddings for News Information Retrieval

arXiv:1909.01772v1 | 3.11 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the synonym/antonym issue in word embeddings for news information retrieval, offering an incremental improvement over existing methods.

The paper tackles the tendency of word embeddings to cluster antonyms together with synonyms. It applies affect-enriched embeddings (Aff2Vec) to news information retrieval and reports better query expansion and ranking results on the New York Times dataset than other models.

Distributed representations of words have been shown to improve the effectiveness of IR systems in many sub-tasks, such as query expansion, retrieval and ranking. Algorithms like word2vec, GloVe and others are also key factors in many improvements across NLP tasks. One common issue with such embedding models is that words like happy and sad appear in similar contexts and hence are wrongly clustered close together in the embedding space. In this paper we leverage Aff2Vec, a set of word embedding models that include affect information, to better capture the affect aspect of news text and achieve better results in information retrieval tasks; such embeddings are also less affected by the synonym/antonym issue. We evaluate their effectiveness on two IR-related tasks (query expansion and ranking) over the New York Times dataset (TREC-core '17), comparing them against other word-embedding-based models and classic ranking models.
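
The synonym/antonym issue and its effect on query expansion can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vectors, the valence scores, and the helper names `cosine` and `expand_query` are hypothetical and not taken from the paper. Appending a single valence score to each vector is only one simple way to add affect information, and the actual Aff2Vec construction may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def expand_query(terms, embeddings, k=1):
    """Append the k nearest embedding neighbours of each query term."""
    expanded = list(terms)
    for term in terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are kept but not expanded
        scored = sorted(
            ((cosine(embeddings[term], vec), word)
             for word, vec in embeddings.items() if word != term),
            reverse=True,
        )
        for _, word in scored[:k]:
            if word not in expanded:
                expanded.append(word)
    return expanded


# Hypothetical 2-d "context only" vectors: happy and sad occur in
# similar contexts, so they end up close together.
context_only = {
    "happy": [1.0, 0.9],
    "sad": [0.9, 1.0],
    "joyful": [1.0, 0.7],
    "election": [-1.0, 0.2],
}

# Hypothetical valence scores (positive vs. negative affect), appended
# as one extra dimension to mimic an affect-enriched space.
valence = {"happy": 1.0, "sad": -1.0, "joyful": 0.9, "election": 0.0}
affect_enriched = {w: vec + [valence[w]] for w, vec in context_only.items()}

print(expand_query(["happy"], context_only))     # antonym "sad" is added
print(expand_query(["happy"], affect_enriched))  # synonym "joyful" is added
```

In this toy setting the context-only space expands the query "happy" with its antonym, while the space with the extra valence dimension expands it with a synonym. That is the behaviour the abstract attributes to affect-enriched embeddings.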

