BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain
BSharedRAG addresses performance issues in e-commerce RAG systems, where long-tail entities and frequent updates are common. The contribution is incremental, however, since it builds on existing RAG and LoRA methods.
The paper tackles the suboptimal performance of separate retrieval and generation modules in e-commerce RAG systems by proposing BSharedRAG, which trains task-specific LoRA modules on a shared backbone model. It reports improvements of 5% and 13% in Hit@3 on two retrieval datasets and of 23% in BLEU-3 for generation.
Retrieval-Augmented Generation (RAG) systems are important in domains such as e-commerce, which has many long-tail entities and frequently updated information. Most existing works adopt separate modules for retrieval and generation, which may be suboptimal because the retrieval and generation tasks cannot benefit from each other to improve performance. We propose a novel Backbone Shared RAG framework (BSharedRAG). It first continually pre-trains a base model on a domain-specific corpus to obtain a domain-specific backbone model, and then trains two plug-and-play Low-Rank Adaptation (LoRA) modules on the shared backbone to minimize the retrieval and generation losses, respectively. Experimental results indicate that BSharedRAG outperforms baseline models by 5% and 13% in Hit@3 on two datasets in retrieval evaluation and by 23% in BLEU-3 in generation evaluation. Our code, models, and dataset are available at https://bsharedrag.github.io.
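The shared-backbone idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: a single frozen linear layer stands in for the domain-specific backbone, no training is shown, and the names `LoRAAdapter` and `SharedBackboneLayer` are hypothetical. It only shows how two low-rank adapters can be plugged into the same frozen weights and selected per task.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRAAdapter:
    """Low-rank update delta_W = B @ A (hypothetical helper class).

    A is initialised with small random values and B with zeros, so the
    adapter contributes nothing until B is changed (e.g. by training).
    """

    def __init__(self, d_out, d_in, rank, rng):
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self, x):
        # Compute B @ (A @ x) without materialising the full d_out x d_in matrix.
        return matvec(self.B, matvec(self.A, x))


class SharedBackboneLayer:
    """One frozen weight matrix shared by two task-specific adapters.

    Assumption: a single linear layer stands in for the whole backbone.
    """

    def __init__(self, d_out, d_in, rank, seed=0):
        rng = random.Random(seed)
        # Shared backbone weights; never modified by the adapters.
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(d_in)] for _ in range(d_out)]
        self.adapters = {
            "retrieval": LoRAAdapter(d_out, d_in, rank, rng),
            "generation": LoRAAdapter(d_out, d_in, rank, rng),
        }

    def forward(self, x, task):
        # Output = frozen backbone output + the selected adapter's low-rank update.
        base = matvec(self.W, x)
        update = self.adapters[task].delta(x)
        return [b + u for b, u in zip(base, update)]
```

Calling `forward(x, "retrieval")` and `forward(x, "generation")` on the same layer reuses `W` and differs only in which adapter is added. In this sketch the two outputs coincide until one adapter's `B` matrix is changed, which is the role each task-specific loss would play during training.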