IR · Jan 11, 2018

Enhancing Translation Language Models with Word Embedding for Information Retrieval

Jibril Frej, Jean-Pierre Chevallet, Didier Schwab

arXiv:1801.03844v1 · 1.7

Originality: Synthesis-oriented

AI Analysis

This is an incremental approach for improving information retrieval systems, though it did not achieve significant gains.

The paper tackles the term mismatch problem in Information Retrieval by enhancing Translation Language Models with word embeddings, but the results show no statistically significant improvement over classical Language Models.

In this paper, we explore the use of Word Embedding semantic resources for the Information Retrieval (IR) task. These embeddings, produced by a shallow neural network, have been shown to capture semantic similarities between words (Mikolov et al., 2013). Hence, our goal is to enhance IR Language Models by addressing the term mismatch problem. To do so, we applied the model presented in the paper "Integrating and Evaluating Neural Word Embeddings in Information Retrieval" by Zuccon et al. (2015), which proposes to estimate the translation probability of a Translation Language Model using the cosine similarity between word embeddings. The results we obtained so far did not show a statistically significant improvement compared to a classical Language Model.
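
To make the idea in the abstract concrete, here is a minimal Python sketch of a translation language model whose translation probability comes from cosine similarity between word vectors. It is an illustration under stated assumptions, not the authors' implementation. The function names, the toy two-dimensional embeddings, the clipping of negative similarities to zero, and the normalisation over the vocabulary are all assumptions made for this example. Smoothing with a collection model, which a practical retrieval system would need, is left out.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def translation_prob(w, u, emb):
    """p_t(w | u): similarity of w to u, normalised over the vocabulary.

    Assumption: negative similarities are clipped to 0 so the values
    form a valid probability distribution over w.
    """
    total = sum(max(cosine(emb[v], emb[u]), 0.0) for v in emb)
    if total == 0.0:
        return 0.0
    return max(cosine(emb[w], emb[u]), 0.0) / total


def translation_lm(w, doc_tokens, emb):
    """p(w | d) = sum over document terms u of p_t(w | u) * p_ml(u | d)."""
    counts = Counter(t for t in doc_tokens if t in emb)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return sum(translation_prob(w, u, emb) * c / n for u, c in counts.items())


# Hypothetical toy embeddings, for illustration only.
emb = {
    "car": [1.0, 0.1],
    "automobile": [0.9, 0.2],
    "banana": [0.0, 1.0],
}

doc = ["car", "car"]
p_automobile = translation_lm("automobile", doc, emb)
p_banana = translation_lm("banana", doc, emb)
```

In this toy setting the document never contains "automobile", yet the model still gives that query term a higher probability than "banana", because its vector is close to that of "car". This is the mechanism by which the approach targets term mismatch.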
