AI · Jun 29, 2025

FinStat2SQL: A Text2SQL Pipeline for Financial Statement Analysis

arXiv:2506.23273v2 · 2 citations · h-index: 1
Originality: Incremental advance
AI Analysis

FinStat2SQL provides a scalable, cost-efficient way to query financial data with AI, aimed at Vietnamese enterprises that report under local standards such as VAS (Vietnamese Accounting Standards).

The paper addresses text-to-SQL for financial statement analysis with FinStat2SQL, a lightweight pipeline that combines large and small language models. Its fine-tuned 7B model reaches 61.33% accuracy with sub-4-second response times on consumer hardware, outperforming GPT-4o-mini.

Despite the advancements of large language models, text2sql still faces many challenges, particularly with complex and domain-specific queries. In finance, database designs and financial reporting layouts vary widely between financial entities and countries, making text2sql even more challenging. We present FinStat2SQL, a lightweight text2sql pipeline enabling natural language queries over financial statements. Tailored to local standards like VAS, it combines large and small language models in a multi-agent setup for entity extraction, SQL generation, and self-correction. We build a domain-specific database and evaluate models on a synthetic QA dataset. A fine-tuned 7B model achieves 61.33% accuracy with sub-4-second response times on consumer hardware, outperforming GPT-4o-mini. FinStat2SQL offers a scalable, cost-efficient solution for financial analysis, making AI-powered querying accessible to Vietnamese enterprises.
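
The three stages named in the abstract (entity extraction, SQL generation, self-correction) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the table `financial_statement`, its columns, and the functions `extract_entities`, `generate_sql`, and `answer` are all hypothetical, and the two agent functions are rule-based stand-ins for what would be language-model calls. The first SQL draft is deliberately wrong so that the retry path is exercised.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create a toy in-memory database (hypothetical schema, not the paper's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE financial_statement "
        "(company TEXT, year INTEGER, item TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO financial_statement VALUES (?, ?, ?, ?)",
        [
            ("ACME", 2023, "net_revenue", 120.0),
            ("ACME", 2024, "net_revenue", 150.0),
        ],
    )
    return conn


def extract_entities(question: str) -> dict:
    """Stand-in for the entity-extraction agent (an LLM call in practice)."""
    words = question.replace("?", "").split()
    year = next((int(w) for w in words if w.isdigit()), None)
    company = next((w for w in words if w.isupper() and len(w) > 1), None)
    # Assumption: the line item is fixed here to keep the sketch short.
    return {"company": company, "year": year, "item": "net_revenue"}


def generate_sql(entities: dict, error: Optional[str] = None) -> str:
    """Stand-in for the SQL-generation agent.

    The first draft uses a wrong column name on purpose; once an error
    message is fed back, the "corrected" draft uses the right one.
    """
    column = "value" if error else "amount"
    return (
        f"SELECT {column} FROM financial_statement "
        "WHERE company = :company AND year = :year AND item = :item"
    )


def answer(question: str, conn: sqlite3.Connection, max_retries: int = 2):
    """Run extraction, generation, and an execute-and-retry correction loop."""
    entities = extract_entities(question)
    error = None
    for attempt in range(max_retries + 1):
        sql = generate_sql(entities, error)
        try:
            rows = conn.execute(sql, entities).fetchall()
            return rows, attempt
        except sqlite3.Error as exc:
            # Self-correction: pass the database error back to the generator.
            error = str(exc)
    raise RuntimeError(f"no valid SQL after {max_retries + 1} attempts: {error}")
```

Calling `answer("What was the net revenue of ACME in 2024?", build_demo_db())` fails on the first draft, feeds the SQLite error back to the generator, and succeeds on the second attempt. Using the database's own error message as the correction signal is one simple design choice; the abstract does not say which signal the paper's self-correction step uses.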

