IR · Jul 24, 2017

A Deep Investigation of Deep IR Models

arXiv:1707.07700v1 · 27 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the black-box nature of deep IR models for researchers and practitioners, offering incremental insights through comparative analysis.

The paper investigates deep information retrieval models by comparing automatically learned features with hand-crafted features and analyzing differences between representation-focused and interaction-focused models, revealing disadvantages and providing guidelines for improvement based on empirical studies on Robust and LETOR4.0 datasets.

The effectiveness of information retrieval (IR) systems has become more important than ever. Deep IR models have gained increasing attention for their ability to automatically learn features from raw text, and many such models have been proposed recently. However, the learning process of these models resembles a black box. It is therefore necessary to identify the differences between the features automatically learned by deep IR models and the hand-crafted features used in traditional learning-to-rank approaches. It is also valuable to investigate the differences among the deep IR models themselves. This paper conducts a deep investigation of deep IR models. Specifically, we conduct an extensive empirical study on two datasets, Robust and LETOR4.0. We first compare the automatically learned features with hand-crafted features with respect to query term coverage, document length, embeddings, and robustness. The comparison reveals a number of disadvantages of the learned features relative to hand-crafted features, and on that basis we establish guidelines for improving existing deep IR models. Furthermore, we compare two categories of deep IR models: representation-focused models and interaction-focused models. The results show that the two types of models focus on different categories of words, including topic-related words and query-related words.

