CLIR | Jul 30, 2025

A Comprehensive Taxonomy of Negation for NLP and Neural Retrievers

arXiv:2507.22337v3 | 7 citations | h-index: 19 | EMNLP
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in NLP and neural retrieval: accurately answering queries that contain negation. The advance is incremental, since it builds on existing methods and contributes new datasets and analysis.

The paper tackles the underperformance of neural retrieval models on queries containing negation. It introduces a taxonomy of negation and generates two benchmark datasets; the balanced distribution over negation types that the taxonomy produces leads to faster convergence on the NevIR dataset.

Understanding and solving complex reasoning tasks is vital for addressing the information needs of a user. Although dense neural models learn contextualised embeddings, they still underperform on queries containing negation. To understand this phenomenon, we study negation in both traditional neural information retrieval and LLM-based models. We (1) introduce a taxonomy of negation that derives from philosophical, linguistic, and logical definitions; (2) generate two benchmark datasets that can be used to evaluate the performance of neural information retrieval models and to fine-tune models for a more robust performance on negation; and (3) propose a logic-based classification mechanism that can be used to analyze the performance of retrieval models on existing datasets. Our taxonomy produces a balanced data distribution over negation types, providing a better training setup that leads to faster convergence on the NevIR dataset. Moreover, we propose a classification schema that reveals the coverage of negation types in existing datasets, offering insights into the factors that might affect the generalization of fine-tuned models on negation.
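To make the failure mode concrete, the following is a minimal sketch of a contrastive, pairwise evaluation of a retriever on negation. It assumes (this is not stated in the abstract) that each test instance consists of two queries and two documents that differ mainly in negation, and that an instance counts as correct only when both queries rank their own document first. The names `pairwise_accuracy` and `overlap_score`, and the toy data, are hypothetical; the lexical scorer is a stand-in for a neural model, not the paper's method.

```python
from typing import Callable, List, Tuple

# Hypothetical instance format: (query_1, query_2, doc_1, doc_2),
# where doc_1 is the relevant document for query_1 and doc_2 for query_2.
Instance = Tuple[str, str, str, str]
Scorer = Callable[[str, str], float]


def pairwise_accuracy(instances: List[Instance], score: Scorer) -> float:
    """Fraction of instances where BOTH queries prefer their own document."""
    if not instances:
        return 0.0
    correct = 0
    for q1, q2, d1, d2 in instances:
        q1_ok = score(q1, d1) > score(q1, d2)
        q2_ok = score(q2, d2) > score(q2, d1)
        # A scorer that ignores negation tends to get only one side right.
        correct += int(q1_ok and q2_ok)
    return correct / len(instances)


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a retriever: number of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


# Toy data, invented for illustration only.
toy_instances: List[Instance] = [
    (
        "which birds can fly",
        "which birds cannot fly",
        "sparrows can fly",
        "penguins cannot fly",
    ),
]

if __name__ == "__main__":
    print(pairwise_accuracy(toy_instances, overlap_score))
```

Requiring both directions to be correct is the design choice that matters here: a scorer that assigns the same ranking to a query and its negated counterpart can satisfy at most one of the two checks, so it scores zero on that instance.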
