IR · DB · May 5

Beyond Similarity Search: A Unified Data Layer for Production RAG Systems

arXiv:2605.03275 · 26.01 citations
Predicted impact top 94% in IR · last 90 days · Originality: Synthesis-oriented
AI Analysis

For practitioners deploying RAG systems in production, this work addresses critical reliability and performance issues caused by conventional split-system data layers.

The paper identifies three root causes of the gap between prototype and production RAG systems (data staleness, tenant data leakage, and query composition explosion) and proposes a unified data layer using PostgreSQL with pgvector and HNSW indexing. Benchmarks on 50,000 documents show a 92% latency reduction for date-filtered queries and 74% for tenant-scoped queries, zero synchronization inconsistency, and elimination of cross-tenant data leakage, with 93% less synchronization code.

Retrieval-Augmented Generation (RAG) systems have become the standard architecture for grounding large language models in organizational knowledge. Yet production deployments consistently expose a gap between clean prototype performance and real-world reliability. This paper identifies three root causes of that gap: data staleness, tenant data leakage, and query composition explosion. All three trace back to the conventional split-system data layer. We propose and evaluate a unified data layer built on PostgreSQL with native vector search (pgvector) and HNSW indexing. Controlled benchmarks on 50,000 documents show 92% latency reduction for date-filtered queries, 74% for tenant-scoped queries, zero synchronization inconsistency, and complete elimination of cross-tenant data leakage with 93% less synchronization code. We additionally discuss a recommended hybrid tier architecture
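To make the idea of a unified data layer concrete, here is a minimal sketch of how a tenant-scoped, date-filtered similarity search could be expressed as a single PostgreSQL query over a pgvector column with an HNSW index. This is not the paper's implementation: the table name `documents`, its columns, the 384-dimension embedding size, and the helpers `to_vector_literal` and `build_query` are all assumptions chosen for illustration. The sketch only builds the SQL text and parameters (in psycopg-style named-placeholder form) and does not connect to a database.

```python
from datetime import date
from typing import Optional, Sequence

# Hypothetical schema (names and vector size are illustrative assumptions).
# The statements use pgvector's documented `vector` type and `hnsw` index method.
SCHEMA_DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id         bigserial PRIMARY KEY,
    tenant_id  text NOT NULL,
    updated_at date NOT NULL,
    body       text NOT NULL,
    embedding  vector(384) NOT NULL
);
CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents USING hnsw (embedding vector_cosine_ops);
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Render an embedding in pgvector's text format, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_query(
    embedding: Sequence[float],
    tenant_id: str,
    updated_since: Optional[date] = None,
    k: int = 5,
):
    """Build one SQL statement that combines relational filters with
    cosine-distance ordering, so no second system has to be kept in sync."""
    params = {
        "query_vec": to_vector_literal(embedding),
        "tenant_id": tenant_id,
        "k": k,
    }
    # The tenant filter is always applied; the date filter is optional.
    where = ["tenant_id = %(tenant_id)s"]
    if updated_since is not None:
        where.append("updated_at >= %(updated_since)s")
        params["updated_since"] = updated_since

    sql = (
        "SELECT id, body\n"
        "FROM documents\n"
        f"WHERE {' AND '.join(where)}\n"
        # `<=>` is pgvector's cosine-distance operator.
        "ORDER BY embedding <=> %(query_vec)s::vector\n"
        "LIMIT %(k)s"
    )
    return sql, params
```

With a driver such as psycopg, the returned pair could be passed to `cursor.execute(sql, params)`. Because the filters and the vector ordering live in one statement against one table, there is no separate vector store whose metadata must be synchronized, which is the property the abstract attributes to the unified design.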

