IR · CL · May 8

DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models

arXiv:2605.07210 · 59.1 · Has Code
Predicted impact top 59% in IR · last 90 days · Originality: Incremental advance
AI Analysis

For information retrieval, this work provides an efficient multi-representation retrieval method that, after fine-tuning, outperforms the compared retrievers on BEIR-7, though it builds on known diffusion model capabilities.

DiffRetriever uses diffusion language models to generate multiple representative tokens in a single forward pass, overcoming the inefficiency of autoregressive multi-token retrieval. After fine-tuning, it achieves the strongest BEIR-7 performance among compared retrievers, including PromptReps and RepLLaMA.

PromptReps showed that an autoregressive language model can be used directly as a retriever by prompting it to generate dense and sparse representations of a query or passage. Extending this to multiple representatives is inefficient for autoregressive models, since tokens must be generated sequentially, and prior multi-token variants did not reliably improve over single-token decoding. We show that the bottleneck is sequential generation, not the multi-token idea itself. DiffRetriever is a representative-token retriever for diffusion language models: it appends K masked positions to the prompt and reads all K in a single bidirectional forward pass. Across in-domain and out-of-domain evaluation, multi-token DiffRetriever substantially improves over single-token on every diffusion backbone we test, while autoregressive multi-token is flat or negative and pays a latency cost that scales with K where diffusion does not. After supervised fine-tuning, DiffRetriever on Dream is the strongest BEIR-7 retriever in our comparison, ahead of PromptReps, the encoder-style DiffEmbed baseline on the same diffusion backbones, and the contrastively fine-tuned single-vector RepLLaMA. A per-query oracle on the frozen base model exceeds contrastive fine-tuning at the same fixed budget, pointing to adaptive budget selection as future work. Code is available at https://github.com/ielab/diffretriever.
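The contrast between reading K masked positions in one pass and generating K tokens sequentially can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the DiffRetriever code: `ToyModel` is a random stand-in for a language model, `MASK_ID`, `VOCAB`, `DIM`, and `MAX_LEN` are hypothetical values, and the function names are invented. The sketch only shows where the representations are read from and how many forward passes each approach needs; how the K representations are pooled or scored is not covered.

```python
import numpy as np

# Toy sizes; MASK_ID is a hypothetical mask-token id (assumption).
VOCAB, DIM, MAX_LEN, MASK_ID = 50, 8, 64, 0

rng = np.random.default_rng(0)
EMB = rng.normal(size=(VOCAB, DIM))    # toy token embeddings
POS = rng.normal(size=(MAX_LEN, DIM))  # toy positional embeddings
OUT = rng.normal(size=(DIM, VOCAB))    # toy output projection


class ToyModel:
    """Random stand-in for a language model that counts forward passes."""

    def __init__(self):
        self.calls = 0

    def forward(self, ids):
        # One forward pass with full (bidirectional) self-attention.
        self.calls += 1
        ids = np.asarray(ids)
        x = EMB[ids] + POS[: len(ids)]                  # (T, DIM)
        scores = x @ x.T / np.sqrt(DIM)                 # (T, T)
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        hidden = w @ x                                  # (T, DIM)
        return hidden, hidden @ OUT                     # hidden states, logits


def diffusion_reps(model, prompt_ids, k):
    """Append k masked slots and read all k from a single forward pass."""
    ids = list(prompt_ids) + [MASK_ID] * k
    hidden, logits = model.forward(ids)
    return hidden[-k:], logits[-k:]                     # (k, DIM), (k, VOCAB)


def autoregressive_reps(model, prompt_ids, k):
    """Produce k representatives one token at a time: k forward passes."""
    ids = list(prompt_ids)
    hiddens, all_logits = [], []
    for _ in range(k):
        hidden, logits = model.forward(ids)
        # Only the last position is read, so it sees just the preceding tokens.
        hiddens.append(hidden[-1])
        all_logits.append(logits[-1])
        ids.append(int(logits[-1].argmax()))            # greedy next token
    return np.stack(hiddens), np.stack(all_logits)


if __name__ == "__main__":
    prompt = [3, 7, 11]
    diff_model, ar_model = ToyModel(), ToyModel()
    diffusion_reps(diff_model, prompt, k=4)
    autoregressive_reps(ar_model, prompt, k=4)
    print("diffusion forward passes:", diff_model.calls)
    print("autoregressive forward passes:", ar_model.calls)
```

In this toy setting the diffusion-style reader uses one forward pass regardless of K, while the autoregressive loop uses K passes, which mirrors the latency argument in the abstract. Following the PromptReps description, the hidden states could serve as dense representations and the logits as sparse ones, but that mapping is an assumption here.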

