ComRAG: Retrieval-Augmented Generation with Dynamic Vector Stores for Real-time Community Question Answering in Industry
This work addresses the challenge of real-time knowledge utilization in industrial CQA platforms, offering an incremental improvement with measured gains in vector similarity, latency, and chunk growth.
The paper tackles the problem of effectively leveraging historical interactions and domain knowledge in real time for industrial Community Question Answering (CQA) platforms. It proposes ComRAG, a retrieval-augmented generation framework that achieves up to a 25.9% improvement in vector similarity, reduces latency by 8.7% to 23.3%, and lowers chunk growth from 20.23% to 2.06% over iterations.
Community Question Answering (CQA) platforms serve as important community knowledge bases, but effectively leveraging historical interactions and domain knowledge in real time remains a challenge. Existing methods often underutilize external knowledge, fail to incorporate dynamic historical QA context, or lack memory mechanisms suited for industrial deployment. We propose ComRAG, a retrieval-augmented generation framework for real-time industrial CQA that integrates static knowledge with dynamic historical QA pairs via a centroid-based memory mechanism designed for retrieval, generation, and efficient storage. Evaluated on three industrial CQA datasets, ComRAG consistently outperforms all baselines, achieving up to a 25.9% improvement in vector similarity, reducing latency by 8.7% to 23.3%, and lowering chunk growth from 20.23% to 2.06% over iterations.
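The abstract does not spell out how the centroid-based memory works, so the following is only a minimal sketch of one way such a store could be organized, not the authors' implementation. It assumes that historical QA pairs are grouped into clusters, that each cluster is summarized by the mean of its member embeddings, and that a cosine-similarity threshold decides whether a new pair joins an existing cluster. The names `CentroidMemory`, `Cluster`, `threshold`, and the toy two-dimensional embeddings are all hypothetical.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Cluster:
    centroid: list
    # Each member is (embedding, question, answer).
    members: list = field(default_factory=list)


class CentroidMemory:
    """Hypothetical store for historical QA pairs, grouped by centroid."""

    def __init__(self, threshold=0.9):
        # Assumed hyperparameter: minimum similarity to join a cluster.
        self.threshold = threshold
        self.clusters = []

    def _nearest(self, emb):
        """Return the most similar cluster and its similarity to emb."""
        best, best_sim = None, -1.0
        for cluster in self.clusters:
            sim = cosine(emb, cluster.centroid)
            if sim > best_sim:
                best, best_sim = cluster, sim
        return best, best_sim

    def add(self, emb, question, answer):
        """Insert a QA pair, opening a new cluster if none is close enough."""
        cluster, sim = self._nearest(emb)
        if cluster is None or sim < self.threshold:
            self.clusters.append(Cluster(list(emb), [(emb, question, answer)]))
            return
        cluster.members.append((emb, question, answer))
        n = len(cluster.members)
        # Recompute the centroid as the mean of all member embeddings.
        cluster.centroid = [
            sum(m[0][i] for m in cluster.members) / n for i in range(len(emb))
        ]

    def retrieve(self, emb, k=1):
        """Return up to k (question, answer) pairs from the nearest cluster."""
        cluster, sim = self._nearest(emb)
        if cluster is None or sim < self.threshold:
            return []
        ranked = sorted(
            cluster.members, key=lambda m: cosine(emb, m[0]), reverse=True
        )
        return [(q, a) for _, q, a in ranked[:k]]


memory = CentroidMemory(threshold=0.9)
memory.add([1.0, 0.0], "How do I reset my token?", "Use the settings page.")
memory.add([0.99, 0.1], "Where can I reset a token?", "Under account settings.")
memory.add([0.0, 1.0], "What is the rate limit?", "See the quota docs.")
print(len(memory.clusters))
print(memory.retrieve([1.0, 0.0], k=1))
```

In this sketch a query is compared against one centroid per cluster before any individual QA pair is scored, which is one plausible reason a centroid-based design could help retrieval cost. How ComRAG actually bounds storage growth or combines this memory with the static knowledge base is not described in the abstract and is not modeled here.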