FiMI: A Domain-Specific Language Model for Indian Finance Ecosystem

arXiv:2602.05794v2h-index: 6
Originality Highly original
AI Analysis

This model addresses the need for a high-performing financial language model tailored to the Indian finance ecosystem, specifically for NPCI's digital payment systems.

This paper introduces FiMI, a domain-specialized financial language model for Indian digital payment systems. FiMI Base improves finance reasoning by 20% over Mistral Small 24B, and FiMI Instruct outperforms Mistral Small 24B Instruct by 87% on domain-specific tool-calling.

We present FiMI (Finance Model for India), a domain-specialized financial language model developed by National Payments Corporation of India (NPCI) for Indian digital payment systems. We develop two model variants: FiMI Base and FiMI Instruct. FiMI adapts the Mistral Small 24B architecture through a multi-stage training pipeline, beginning with continuous pre-training on 68 Billion tokens of curated financial, multilingual (English, Hindi, Hinglish), and synthetic data. This is followed by instruction fine-tuning and domain-specific supervised fine-tuning focused on multi-turn, tool-driven conversations that model real-world workflows, such as transaction disputes and mandate lifecycle management. Evaluations reveal that FiMI Base achieves a 20\% improvement over the Mistral Small 24B Base model on finance reasoning benchmark, while FiMI Instruct outperforms the Mistral Small 24B Instruct model by 87\% on domain-specific tool-calling. Moreover, FiMI achieves these significant domain gains while maintaining comparable performance to models of similar size on general benchmarks.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes