CL AI IR LG · Apr 23, 2024

Retrieval Augmented Generation for Domain-specific Question Answering

Sanat Sharma, David Seunghyun Yoon, Franck Dernoncourt, Dewang Sultania, Karishma Bagga, Mengjiao Zhang, Trung Bui, Varun Kotte

arXiv:2404.14760v2 · 10.028 citations · h-index: 41

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem that general-purpose large language models lack domain-specific understanding for applications such as finance or customer service. The contribution appears incremental, since it adapts existing methods to a specific domain.

The authors tackled the problem of domain-specific question answering by building a system for Adobe products, proposing a framework to compile a large QA database and develop retrieval-aware fine-tuning of a large language model, which reduces hallucinations and improves generation.

Question answering (QA) has become an important application in the advanced development of large language models. General pre-trained large language models for question answering are not trained to properly understand the knowledge or terminology of a specific domain, such as finance, healthcare, education, or customer service for a product. To better cater to domain-specific understanding, we build an in-house question-answering system for Adobe products. We propose a novel framework to compile a large question-answer database and develop an approach for retrieval-aware fine-tuning of a large language model. We show that fine-tuning the retriever leads to major improvements in the final generation. Our overall approach reduces hallucinations during generation while keeping the latest retrieval information in context for contextual grounding.
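The abstract describes grounding generation in entries retrieved from a question-answer database. As a rough illustration of that general retrieve-then-prompt pattern, here is a minimal sketch. It is not the authors' system: the database entries, the bag-of-words cosine scoring, and the names `QA_DATABASE`, `retrieve`, and `build_prompt` are all assumptions made for this example, and the paper's fine-tuned retriever and language model are not reproduced.

```python
import math
import re
from collections import Counter

# Hypothetical toy QA database; entries are placeholders, not data from the paper.
QA_DATABASE = [
    ("How do I crop an image?", "Select the crop tool, drag the handles, then confirm."),
    ("How do I export a document as PDF?", "Open the export menu and choose the PDF format."),
    ("How do I reset my password?", "Use the account page and follow the reset link."),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Return the k database entries whose questions best match the query.

    Bag-of-words cosine similarity stands in for a learned retriever here.
    """
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(question))), question, answer)
        for question, answer in QA_DATABASE
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_prompt(query, k=2):
    """Place the retrieved QA pairs in the prompt so the answer can be grounded in them."""
    context = "\n".join(f"Q: {q}\nA: {a}" for _, q, a in retrieve(query, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

Calling `build_prompt("crop an image")` yields a prompt whose context section starts with the crop entry. In a real pipeline that prompt would then be passed to a language model, which this sketch leaves out.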
