IR, AI | Sep 2, 2025

HF-RAG: Hierarchical Fusion-based RAG with Multiple Sources and Rankers

Payel Santra, Madhusudan Ghosh, Debasis Ganguly, Partha Basuchowdhuri, Sudip Kumar Naskar

arXiv:2509.02837v13.62 citations | h-index: 24 | CIKM

Originality: Incremental advance

AI Analysis

This work addresses the problem of effectively integrating heterogeneous data sources in RAG for fact verification, representing an incremental advancement.

The paper tackles the challenge of combining labeled and unlabeled data in retrieval-augmented generation (RAG). It proposes a hierarchical fusion method that first aggregates the outputs of multiple rankers within each source and then standardizes the scores so the two sources can be merged. On fact verification tasks, this yields consistent improvements over the best individual ranker or source, along with better out-of-domain generalization.

Leveraging both labeled (input-output associations) and unlabeled data (wider contextual grounding) may provide complementary benefits in retrieval-augmented generation (RAG). However, effectively combining evidence from these heterogeneous sources is challenging, as the respective similarity scores are not inter-comparable. Additionally, aggregating beliefs from the outputs of multiple rankers can improve the effectiveness of RAG. Our proposed method first aggregates the top documents from a number of IR models using a standard rank fusion technique for each source (labeled and unlabeled). Next, we standardize the retrieval score distributions within each source by applying a z-score transformation before merging the top-retrieved documents from the two sources. We evaluate our approach on the fact verification task, demonstrating that it consistently improves over the best-performing individual ranker or source and also shows better out-of-domain generalization.
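
The two-level pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract only says "a standard rank fusion technique", so reciprocal rank fusion (RRF) with the constant k=60 is an assumption made here, and the function names (`rrf_fuse`, `z_normalize`, `hierarchical_fuse`) and the `top_n` parameter are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with reciprocal rank fusion.

    RRF and k=60 are assumptions; the abstract does not name the fusion method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def z_normalize(scores):
    """Standardize one source's scores to zero mean and unit variance."""
    if not scores:
        return {}
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All scores identical: no document stands out within this source.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - mu) / sigma for doc_id, s in scores.items()}


def hierarchical_fuse(labeled_rankings, unlabeled_rankings, top_n=5):
    """Fuse rankers within each source, z-normalize, then merge across sources."""
    merged = []
    sources = (("labeled", labeled_rankings), ("unlabeled", unlabeled_rankings))
    for source, rankings in sources:
        fused = rrf_fuse(rankings)                     # level 1: across rankers
        for doc_id, z in z_normalize(fused).items():   # make sources comparable
            merged.append((source, doc_id, z))
    # Level 2: a single ranking over both sources, ordered by z-score.
    merged.sort(key=lambda item: item[2], reverse=True)
    return merged[:top_n]


if __name__ == "__main__":
    # Toy doc ids; each inner list is one ranker's output for one source.
    labeled = [["L1", "L2", "L3"], ["L2", "L1", "L3"]]
    unlabeled = [["U1", "U2"], ["U1", "U3"]]
    for source, doc_id, z in hierarchical_fuse(labeled, unlabeled, top_n=4):
        print(f"{source:9s} {doc_id}  z={z:+.3f}")
```

Normalizing within each source before merging means a document is ranked by how far it stands out from its own source's score distribution, which sidesteps the fact that raw scores from the labeled and unlabeled collections are not directly comparable.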
