ContextRAG: Extraction-Free Hierarchical Graph Construction for Retrieval-Augmented Generation
For practitioners of retrieval-augmented generation, ContextRAG cuts the LLM token cost of graph-based indexing by roughly three orders of magnitude (about 22K tokens versus an extrapolated 23M for a local HiRAG reproduction) while maintaining competitive performance on multi-hop questions.
ContextRAG is a graph RAG system that constructs its graph topology without LLM-based entity or relation extraction, using fuzzy concept graphs derived from chunk embeddings. On a 130-task UltraDomain subset, it achieves 33.6% F1 overall and 36.8% F1 on multi-hop tasks, using only 30 LLM calls and 22,073 tokens for indexing, compared to an estimated 23M tokens (linearly extrapolated from a 20-task run) for a local HiRAG reproduction.
Graph-structured retrieval-augmented generation (RAG) systems can improve answer quality on multi-hop questions, but many current systems rely on large language models (LLMs) to extract entities, relations, and summaries during indexing. These calls add token and wall-clock costs that grow with corpus size. We present ContextRAG, a graph RAG system whose graph topology is constructed without LLM-based entity or relation extraction. ContextRAG derives a fuzzy concept graph over chunk embeddings using residual-quantization k-means and Formal Concept Analysis with Łukasiewicz residuated logic. Bridge-like and meet-derived context nodes are induced by soft fuzzy join and meet operations, rather than by LLM-written graph edges. On a 130-task UltraDomain subset, ContextRAG builds its index with 30 LLM calls and 22,073 tokens. In contrast, a local HiRAG reproduction stress test required 870 indexing calls and 3.54M tokens on a 20-task subset before failing during graph construction; linear extrapolation to 130 tasks (3.54M × 130/20 ≈ 23.0M) implies over 23M indexing tokens. ContextRAG obtains 33.6% F1 overall and 36.8% F1 on multi-hop tasks. An activation analysis shows that queries retrieving at least one lattice-derived node in the top five score 3.9 percentage points higher in F1 than queries that do not; this association is diagnostic rather than causal.
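To make the Łukasiewicz machinery concrete, the following is a minimal sketch of the textbook fuzzy Formal Concept Analysis operators, not ContextRAG's actual implementation. The abstract does not say how the fuzzy context is built from the quantized embeddings or how the "soft" join and meet are defined, so the incidence matrix `I` (for example, the degree to which each chunk belongs to each codebook cluster) and all function names here are assumptions made for illustration.

```python
from typing import List, Tuple

# I[x][y] in [0, 1]: degree to which object (chunk) x has attribute y.
# The matrix below is a hypothetical toy example, not data from the paper.
Matrix = List[List[float]]


def luk_tnorm(a: float, b: float) -> float:
    """Lukasiewicz t-norm (fuzzy conjunction): max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)


def luk_residuum(a: float, b: float) -> float:
    """Lukasiewicz residuum (fuzzy implication): min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def up(A: List[float], I: Matrix) -> List[float]:
    """Intent of a fuzzy object set A: degree to which each attribute
    is shared by all objects in A."""
    return [
        min(luk_residuum(A[x], I[x][y]) for x in range(len(I)))
        for y in range(len(I[0]))
    ]


def down(B: List[float], I: Matrix) -> List[float]:
    """Extent of a fuzzy attribute set B: degree to which each object
    has all attributes in B."""
    return [
        min(luk_residuum(B[y], I[x][y]) for y in range(len(B)))
        for x in range(len(I))
    ]


def closure(A: List[float], I: Matrix) -> Tuple[List[float], List[float]]:
    """Smallest fuzzy formal concept (extent, intent) whose extent contains A."""
    intent = up(A, I)
    return down(intent, I), intent


# Toy context: 3 chunks, 2 attributes (values are illustrative only).
I = [
    [1.0, 0.5],
    [0.75, 0.25],
    [0.0, 1.0],
]

# Concept generated by chunk 0 alone.
extent, intent = closure([1.0, 0.0, 0.0], I)
print(extent)  # [1.0, 0.75, 0.0]
print(intent)  # [1.0, 0.5]
```

In this toy context, chunk 1 joins the concept generated by chunk 0 to degree 0.75 without any extracted entity or relation linking them, which is the kind of graded grouping a concept lattice can expose. The residuum is the adjoint of the t-norm (`luk_tnorm(a, b) <= c` exactly when `a <= luk_residuum(b, c)`), which is what makes `up` and `down` a Galois connection and `closure` a fixed point.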