Reducing Redundancy in Retrieval-Augmented Generation through Chunk Filtering
For practitioners building RAG systems, this study offers lightweight filtering methods that improve storage and retrieval efficiency without a significant loss in retrieval quality.
This study proposes chunk filtering strategies (semantic, topic-based, and named-entity-based) to reduce redundancy in RAG pipelines. Entity-based filtering reduces vector index size by approximately 25% to 36% while keeping retrieval quality close to the baseline.
Standard Retrieval-Augmented Generation (RAG) chunking methods often create excessive redundancy, increasing storage costs and slowing retrieval. This study explores semantic, topic-based, and named-entity-based chunk filtering strategies that shrink the indexed corpus while preserving retrieval quality. Experiments are conducted on multiple corpora. Retrieval performance is evaluated at the token level using precision, recall, and intersection-over-union (IoU) metrics. Results indicate that entity-based filtering can reduce vector index size by approximately 25% to 36% while keeping retrieval quality close to the baseline. These findings suggest that redundancy introduced during chunking can be effectively reduced through lightweight filtering, improving the efficiency of retrieval-oriented components in RAG pipelines.
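As an illustration of the token-level evaluation mentioned above, the following is a minimal sketch of how precision, recall, and IoU could be computed over token positions. The abstract does not spell out the exact formulation, so this sketch assumes that retrieved chunks and ground-truth excerpts are both given as half-open (start, end) token spans in the same corpus. The helper names `span_tokens` and `token_metrics` are hypothetical and not taken from the study.

```python
from typing import Iterable, Set, Tuple

Span = Tuple[int, int]  # assumed convention: half-open [start, end) token offsets


def span_tokens(spans: Iterable[Span]) -> Set[int]:
    """Expand spans into a set of token positions; overlapping spans count once."""
    tokens: Set[int] = set()
    for start, end in spans:
        tokens.update(range(start, end))
    return tokens


def token_metrics(retrieved: Set[int], relevant: Set[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, IoU) over two sets of token positions."""
    overlap = len(retrieved & relevant)
    union = len(retrieved | relevant)
    precision = overlap / len(retrieved) if retrieved else 0.0
    recall = overlap / len(relevant) if relevant else 0.0
    iou = overlap / union if union else 0.0
    return precision, recall, iou


# Hypothetical example: two overlapping retrieved chunks and one relevant excerpt.
retrieved = span_tokens([(0, 50), (40, 90)])  # covers tokens 0..89 (90 tokens)
relevant = span_tokens([(30, 60)])            # covers tokens 30..59 (30 tokens)
precision, recall, iou = token_metrics(retrieved, relevant)
print(f"precision={precision:.3f} recall={recall:.3f} iou={iou:.3f}")
```

Because overlapping chunks are merged into one set of positions, redundant tokens are not counted twice in this sketch. Under that assumption, retrieving fewer redundant chunks can raise precision and IoU without lowering recall, which is why metrics of this kind suit a comparison of filtered and unfiltered indexes.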