DL | AI | CL | Jun 27, 2025

The Attribution Crisis in LLM Search Results

arXiv:2508.00838v1 | 2 citations | h-index: 3
Originality: Incremental advance
AI Analysis

The study highlights a transparency problem for users and developers of web-enabled LLMs by revealing systematic exploitation patterns in citation practices. Its contribution is incremental: it documents the issue rather than solving it.

The study analyzed approximately 14,000 real-world LMArena conversation logs to document an 'attribution gap': LLMs such as Google Gemini and OpenAI GPT-4o often answer queries without citing the web pages they consume. On average, a query answered by Gemini or Perplexity's Sonar leaves about 3 relevant websites uncited.

Web-enabled LLMs frequently answer queries without crediting the web pages they consume, creating an "attribution gap" - the difference between relevant URLs read and those actually cited. Drawing on approximately 14,000 real-world LMArena conversation logs with search-enabled LLM systems, we document three exploitation patterns: 1) No Search: 34% of Google Gemini and 24% of OpenAI GPT-4o responses are generated without explicitly fetching any online content; 2) No citation: Gemini provides no clickable citation source in 92% of answers; 3) High-volume, low-credit: Perplexity's Sonar visits approximately 10 relevant pages per query but cites only three to four. A negative binomial hurdle model shows that the average query answered by Gemini or Sonar leaves about 3 relevant websites uncited, whereas GPT-4o's tiny uncited gap is best explained by its selective log disclosures rather than by better attribution. Citation efficiency - extra citations provided per additional relevant web page visited - varies widely across models, from 0.19 to 0.45 on identical queries, underscoring that retrieval design, not technical limits, shapes ecosystem impact. We recommend a transparent LLM search architecture based on standardized telemetry and full disclosure of search traces and citation logs.
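The two quantities defined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method: the paper estimates these quantities with a negative binomial hurdle model, whereas the sketch below uses a plain set difference for the attribution gap and an ordinary least-squares slope as a crude stand-in for citation efficiency. The function names, URLs, and sample counts are all hypothetical and chosen only for illustration.

```python
def attribution_gap(relevant_visited, cited):
    """Count relevant URLs that were read but never cited."""
    return len(set(relevant_visited) - set(cited))


def citation_efficiency(pairs):
    """Least-squares slope of citations on relevant pages visited.

    `pairs` is a list of (relevant_pages_visited, citations_given) tuples,
    one per query. This is a simplified stand-in, NOT the paper's
    negative binomial hurdle model.
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var == 0:
        raise ValueError("need variation in pages visited to fit a slope")
    return cov / var


# Hypothetical log entry: 10 relevant pages visited, 4 of them cited.
visited = [f"https://example.org/page{i}" for i in range(10)]
cited = visited[:4]
gap = attribution_gap(visited, cited)  # 10 - 4 = 6 uncited pages

# Hypothetical per-query counts: (relevant pages visited, citations given).
toy_queries = [(2, 1), (4, 2), (6, 3)]
efficiency = citation_efficiency(toy_queries)  # slope of 0.5 on this toy data
```

On real logs, the inputs would come from the disclosed search traces and citation lists. The abstract notes that such disclosure is selective for some systems, which is why a small measured gap does not by itself indicate better attribution.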

