CL | CE | LG | Dec 29, 2025

Integrating Domain Knowledge for Financial QA: A Multi-Retriever RAG Approach with LLMs

arXiv:2512.23848v1 | 1 citation | h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses errors in financial QA that matter to finance professionals, though the contribution is incremental and consists mainly of domain-specific adaptations.

This research tackles errors in financial numerical reasoning QA by implementing a multi-retriever RAG system with LLMs. It achieves state-of-the-art performance, with an improvement of more than 7% over the baseline, and confirms the enhanced numerical reasoning of the latest LLMs.

This research project addresses errors in financial numerical reasoning Question Answering (QA) tasks that stem from a lack of domain knowledge in finance. Despite recent advances in Large Language Models (LLMs), financial numerical questions remain challenging because they require both finance-specific domain knowledge and complex multi-step numerical reasoning. We implement a multi-retriever Retrieval-Augmented Generation (RAG) system that retrieves both external domain knowledge and internal question contexts, and we use the latest LLM to tackle these tasks. Through comprehensive ablation experiments and error analysis, we find that domain-specific training with the SecBERT encoder contributes significantly to our best neural-symbolic model, which surpasses the top model from the FinQA paper that serves as our baseline. This suggests the potential for superior performance from domain-specific training. Furthermore, our best prompt-based LLM generator achieves state-of-the-art (SOTA) performance with a significant improvement (>7%), yet it remains below human expert performance. This study highlights the trade-off between hallucination loss and external knowledge gains in smaller models and few-shot examples. For larger models, the gains from external facts typically outweigh the hallucination loss. Finally, our findings confirm the enhanced numerical reasoning capabilities of the latest LLM, which is optimized for few-shot learning.
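The multi-retriever idea in the abstract, one retriever over the question's own report context and another over external finance knowledge, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy bag-of-words cosine retriever in place of a trained encoder such as SecBERT, and every function name, passage, and glossary entry below is hypothetical and invented for illustration.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, passages, k):
    """Toy lexical retriever (assumption: stands in for a trained encoder)."""
    q = Counter(tokenize(query))
    ranked = sorted(passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)
    return ranked[:k]


def build_prompt(question, report_passages, finance_glossary, k_internal=2, k_external=1):
    """Merge evidence from the internal and external retrievers into one prompt."""
    internal = retrieve(question, report_passages, k_internal)
    external = retrieve(question, finance_glossary, k_external)
    lines = ["Domain knowledge:"]
    lines += [f"- {p}" for p in external]
    lines.append("Report context:")
    lines += [f"- {p}" for p in internal]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Hypothetical toy data, invented for illustration only.
report = [
    "Net revenue was 120 million in 2019 and 100 million in 2018.",
    "The company opened three new offices in 2019.",
    "Operating expenses were 80 million in 2019.",
]
glossary = [
    "Percentage change is (new value - old value) / old value.",
    "Gross margin is gross profit divided by sales.",
]

question = "What was the percentage change in net revenue from 2018 to 2019?"
prompt = build_prompt(question, report, glossary)
print(prompt)
```

The resulting prompt would then be passed to an LLM generator. Keeping the two retrievers separate lets the amount of external knowledge (`k_external`) be tuned independently of the report context, which is one way to probe the trade-off between external knowledge gains and hallucination loss that the abstract describes.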

