Secure Multifaceted-RAG for Enterprise: Hybrid Knowledge Retrieval with Security Filtering
SecMulti-RAG provides a practical and secure solution for enterprise RAG systems, addressing completeness and data-leakage concerns in specific domains such as automotive.
The paper tackles limited retrieval scope and data security risks in enterprise RAG systems by proposing the SecMulti-RAG framework, which retrieves from internal documents, pre-generated expert knowledge, and external LLM-generated knowledge, with a security filter gating the use of external LLMs. The results show significant improvements over traditional RAG, with win rates of 79.3-91.9% in LLM-based evaluation and 56.3-70.4% in human evaluation on an automotive report generation task.
Existing Retrieval-Augmented Generation (RAG) systems face challenges in enterprise settings due to limited retrieval scope and data security risks. When relevant internal documents are unavailable, the system struggles to generate accurate and complete responses. Additionally, using closed-source Large Language Models (LLMs) raises concerns about exposing proprietary information. To address these issues, we propose the Secure Multifaceted-RAG (SecMulti-RAG) framework, which retrieves not only from internal documents but also from two supplementary sources: pre-generated expert knowledge for anticipated queries and on-demand external LLM-generated knowledge. To mitigate security risks, we adopt a local open-source generator and selectively utilize external LLMs only when prompts are deemed safe by a filtering mechanism. This approach enhances completeness, prevents data leakage, and reduces costs. In our evaluation on a report generation task in the automotive industry, SecMulti-RAG significantly outperforms traditional RAG, achieving 79.3 to 91.9 percent win rates across correctness, richness, and helpfulness in LLM-based evaluation, and 56.3 to 70.4 percent in human evaluation. This highlights SecMulti-RAG as a practical and secure solution for enterprise RAG.
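The retrieval flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Passage`, `keyword_filter`, `overlap_score`, and `retrieve` are hypothetical, the keyword check is only a toy stand-in for the paper's filtering mechanism, and word overlap stands in for whatever retriever the framework actually uses. The local generator that consumes the retrieved passages is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Passage:
    source: str  # "internal", "expert", or "external"
    text: str


def keyword_filter(query: str,
                   blocked_terms: Sequence[str] = ("confidential", "prototype")) -> bool:
    """Toy stand-in (assumption) for the security filter.

    Returns True when the query is considered safe to send to an external LLM.
    """
    lowered = query.lower()
    return not any(term in lowered for term in blocked_terms)


def overlap_score(query: str, text: str) -> int:
    """Count shared lowercase words; a placeholder for a real retriever score."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str,
             internal_docs: List[str],
             expert_knowledge: List[str],
             is_safe: Callable[[str], bool],
             external_llm: Callable[[str], str],
             top_k: int = 3) -> List[Passage]:
    """Gather candidates from all three sources, then rank them together."""
    pool = [Passage("internal", d) for d in internal_docs]
    pool += [Passage("expert", d) for d in expert_knowledge]
    # The external LLM is consulted only when the filter deems the query safe,
    # so unsafe queries never leave the local environment.
    if is_safe(query):
        pool.append(Passage("external", external_llm(query)))
    ranked = sorted(pool, key=lambda p: overlap_score(query, p.text), reverse=True)
    return ranked[:top_k]
```

In this sketch, a query that trips the filter is answered from internal documents and pre-generated expert knowledge alone, which mirrors the abstract's point that external LLMs are used selectively.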