IR · Apr 29, 2020

Complementing Lexical Retrieval with Semantic Residual Embedding

Luyu Gao, Zhuyun Dai, Tongfei Chen, Zhen Fan, Benjamin Van Durme, Jamie Callan

arXiv:2004.13969v3 · 26.289 citations

Originality: Incremental advance

AI Analysis

This work addresses the difficulty lexical retrieval has in capturing semantic information, offering a hybrid approach that strengthens retrieval pipelines.

The paper tackles the problem of complementing lexical retrieval models such as BM25 with semantic matching. It introduces CLEAR, a retrieval model whose residual-based embedding learning method trains the neural embedding to encode the language structures and semantics that lexical retrieval misses. The authors report advantages over state-of-the-art retrieval models and substantial improvements in the end-to-end accuracy and efficiency of reranking pipelines.

This paper presents CLEAR, a retrieval model that seeks to complement classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model. CLEAR explicitly trains the neural embedding to encode language structures and semantics that lexical retrieval fails to capture with a novel residual-based embedding learning method. Empirical evaluations demonstrate the advantages of CLEAR over state-of-the-art retrieval models, and that it can substantially improve the end-to-end accuracy and efficiency of reranking pipelines.
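The idea of combining a lexical score with an embedding score, and of training the embedding only on what the lexical model gets wrong, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the weighted-sum combination, the `lexical_weight` and `base_margin` parameters, and the exact form of the margin are all assumptions made for this example, and the lexical scores are taken as precomputed numbers (for instance from BM25).

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def hybrid_score(lexical_score: float,
                 query_vec: Sequence[float],
                 doc_vec: Sequence[float],
                 lexical_weight: float = 1.0) -> float:
    """Assumed combination: weighted lexical score plus embedding similarity."""
    return lexical_weight * lexical_score + dot(query_vec, doc_vec)


def residual_margin_loss(lex_pos: float, lex_neg: float,
                         emb_pos: float, emb_neg: float,
                         base_margin: float = 1.0,
                         lexical_weight: float = 1.0) -> float:
    """Hinge loss whose margin shrinks when the lexical model already
    ranks the relevant document above the non-relevant one, and grows
    when the lexical model ranks them the wrong way round (assumed form)."""
    margin = base_margin - lexical_weight * (lex_pos - lex_neg)
    return max(0.0, margin - (emb_pos - emb_neg))


# Lexical model already separates the pair well: nothing left to learn.
easy = residual_margin_loss(lex_pos=5.0, lex_neg=1.0, emb_pos=0.0, emb_neg=0.0)

# Lexical model prefers the wrong document: the embedding must compensate.
hard = residual_margin_loss(lex_pos=1.0, lex_neg=5.0, emb_pos=0.0, emb_neg=0.0)

print(easy, hard)
```

Under these assumptions the embedding receives a training signal only for pairs where the lexical ranking falls short, which is one way to read "residual" here.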
