IR · CL · May 12, 2023

NevIR: Negation in Neural Information Retrieval

arXiv:2305.07614v2 · 115 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a weakness in neural IR that matters to users who rely on accurate document retrieval, but it is incremental: it benchmarks existing methods without introducing a new solution.

The paper examines how negation affects neural information retrieval models by constructing a benchmark in which models rank documents that differ only by negation. Most models, including state-of-the-art ones, perform the same as or worse than a random ranking; cross-encoders perform best but still lag behind human performance.

Negation is a common everyday phenomenon and has been a consistent area of weakness for language models (LMs). Although the Information Retrieval (IR) community has adopted LMs as the backbone of modern IR architectures, there has been little to no research on how negation impacts neural IR. We therefore construct a straightforward benchmark on this theme: asking IR models to rank two documents that differ only by negation. We show that the results vary widely according to the type of IR architecture: cross-encoders perform best, followed by late-interaction models, and in last place are bi-encoder and sparse neural architectures. We find that most information retrieval models (including SOTA ones) do not consider negation, performing the same as or worse than a random ranking. We show that although the obvious approach of continued fine-tuning on a dataset of contrastive documents containing negations increases performance (as does model size), there is still a large gap between machine and human performance.
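
The ranking task described in the abstract can be illustrated with a minimal sketch. It assumes a layout this page does not spell out: each instance pairs two queries with two documents that differ only by negation, and an instance counts as correct only when both queries rank their matching document first. The names `pairwise_accuracy` and `overlap_score` and the toy example are hypothetical, and the scorer is a deliberately naive word-overlap baseline, not any model evaluated in the paper.

```python
from typing import Callable, List, Tuple

# Hypothetical instance layout (an assumption, not taken from this page):
# (query_1, query_2, doc_1, doc_2), where doc_1 is the right answer for
# query_1 and doc_2 is the right answer for query_2.
Instance = Tuple[str, str, str, str]


def pairwise_accuracy(score: Callable[[str, str], float],
                      data: List[Instance]) -> float:
    """Fraction of instances where BOTH queries rank their own document first."""
    correct = 0
    for q1, q2, d1, d2 in data:
        ok1 = score(q1, d1) > score(q1, d2)  # query_1 must prefer doc_1
        ok2 = score(q2, d2) > score(q2, d1)  # query_2 must prefer doc_2
        correct += int(ok1 and ok2)          # ties count as wrong
    return correct / len(data)


def overlap_score(query: str, doc: str) -> float:
    """Toy scorer: word overlap that ignores negation words entirely."""
    negations = {"not", "no", "never"}
    q = set(query.lower().split()) - negations
    d = set(doc.lower().split()) - negations
    return float(len(q & d))


# One made-up contrastive instance: the documents differ only by "not".
toy = [(
    "which drug is safe for children",
    "which drug is not safe for children",
    "the drug is safe for children",
    "the drug is not safe for children",
)]

print(pairwise_accuracy(overlap_score, toy))
```

Because the toy scorer discards negation words, it assigns both documents the same score for each query and therefore gets this instance wrong. Under the both-must-be-right assumption, a scorer has to be sensitive to the negation itself to score above zero here.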

Code Implementations: 1 repo
