IR · AI · CL · Apr 14

ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation

arXiv:2605.18769 · 65.9
AI Analysis

ClusterRAG targets two weaknesses of personalized RAG systems: high retrieval cost and the absence of collaborative filtering.

ClusterRAG improves personalized retrieval-augmented generation by incorporating collaborative signals from similar users, achieving state-of-the-art performance on the LaMP benchmark across diverse tasks.

Personalized Retrieval-Augmented Generation (RAG) relies on accurately selecting user-relevant documents. In practice, existing RAG approaches often suffer from high retrieval costs and overlook the fact that collaborative signals from similar users can enhance personalized generation for the current user. We propose ClusterRAG, a cluster-based collaborative filtering framework for personalized Retrieval-Augmented Generation. ClusterRAG represents users through their profile documents, organizes users into semantically coherent clusters using density-based clustering, and performs retrieval at both the cluster and document levels via cluster-level similarity and fine-grained ranking. Extensive experiments on the LaMP benchmark demonstrate that jointly leveraging the target user's profile and profiles from top similar users consistently yields the best performance across diverse tasks. Further analysis shows that ClusterRAG integrates seamlessly with different dense retrievers and rankers, and remains effective when paired with both fine-tuned and zero-shot language models.
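The pipeline described in the abstract (user representation, density-based clustering, cluster-level selection, document-level ranking) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a user is represented by the mean of their document embeddings, that clustering is a plain DBSCAN over cosine distance, and that the final ranker is cosine similarity to the query. The toy 2-D vectors, the `PROFILES` data, and the names `retrieve`, `top_users`, and `top_docs` are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    """Element-wise mean; used here as an assumed user representation."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over cosine distance. Returns one label per point, -1 = noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points))
                if 1.0 - cosine(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels

def retrieve(query_vec, target, profiles, eps=0.1, min_pts=2, top_users=1, top_docs=2):
    """Two-stage retrieval: pick similar users in the target's cluster, then rank documents."""
    users = list(profiles)
    user_vecs = [mean_vector([vec for _, vec in profiles[u]]) for u in users]
    labels = dbscan(user_vecs, eps, min_pts)

    # Cluster level: peers share the target's cluster, ordered by user-vector similarity.
    t = users.index(target)
    peers = [i for i in range(len(users))
             if i != t and labels[t] != -1 and labels[i] == labels[t]]
    peers.sort(key=lambda i: cosine(user_vecs[t], user_vecs[i]), reverse=True)

    # Document level: pool the target's and top peers' documents, rank against the query.
    pool = [t] + peers[:top_users]
    candidates = [doc for i in pool for doc in profiles[users[i]]]
    candidates.sort(key=lambda doc: cosine(query_vec, doc[1]), reverse=True)
    return labels, [text for text, _ in candidates[:top_docs]]

# Hypothetical toy profiles: (document text, 2-D embedding) pairs per user.
PROFILES = {
    "alice": [("alice: sci-fi review", [1.0, 0.1]), ("alice: space opera note", [0.9, 0.2])],
    "bob":   [("bob: sci-fi ranking", [1.0, 0.0]), ("bob: robot novel", [0.8, 0.3])],
    "carol": [("carol: pasta recipe", [0.0, 1.0]), ("carol: bread recipe", [0.1, 0.9])],
    "dave":  [("dave: soup recipe", [0.2, 1.0]), ("dave: cake recipe", [0.0, 0.8])],
}

labels, top = retrieve([1.0, 0.0], "alice", PROFILES)
print(labels, top)
```

With these toy values, alice and bob fall into one cluster and carol and dave into another, so only bob's documents are added to alice's candidate pool. The top-ranked result then mixes a document from bob with one from alice, which is the kind of collaborative signal the abstract refers to. Restricting candidates to one cluster is also where a saving in retrieval cost would come from in this sketch, since documents from other clusters are never scored.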

