CL, AI · Sep 10, 2025

Low-Resource Fine-Tuning for Multi-Task Structured Information Extraction with a Billion-Parameter Instruction-Tuned Model

arXiv:2509.08381v1
Originality: 2.7 (Incremental advance)
AI Analysis

This work addresses the high computational cost and data scarcity that smaller teams face in domains such as finance and legal analytics. It is an incremental advance, since it applies existing fine-tuning methods to a smaller model.

The paper tackles the problem of deploying large language models for structured information extraction in resource-constrained settings by fine-tuning a billion-parameter model, ETLCH, with low-rank adaptation on small datasets. ETLCH outperforms strong baselines across most metrics, with substantial gains at low data scales, enabling cost-effective extraction pipelines.

Deploying large language models (LLMs) for structured data extraction in domains such as financial compliance reporting, legal document analytics, and multilingual knowledge base construction is often impractical for smaller teams due to the high cost of running large architectures and the difficulty of preparing large, high-quality datasets. Most recent instruction-tuning studies focus on seven-billion-parameter or larger models, leaving limited evidence on whether much smaller models can work reliably under low-resource, multi-task conditions. This work presents ETLCH, a billion-parameter LLaMA-based model fine-tuned with low-rank adaptation on only a few hundred to one thousand samples per task for JSON extraction, knowledge graph extraction, and named entity recognition. Despite its small scale, ETLCH outperforms strong baselines across most evaluation metrics, with substantial gains observed even at the lowest data scale. These findings demonstrate that well-tuned small models can deliver stable and accurate structured outputs at a fraction of the computational cost, enabling cost-effective and reliable information extraction pipelines in resource-constrained environments.
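To make the cost argument concrete, the following is a minimal sketch of the general low-rank adaptation idea the abstract refers to: a frozen weight matrix W is augmented with a trainable product B·A of rank r, so only r·(d_in + d_out) parameters are updated per matrix. It is not the authors' code. The matrix size (2048 × 2048), rank (8), scaling factor alpha, and all function names are illustrative assumptions, not values reported in the paper.

```python
# Illustrative sketch of low-rank adaptation (LoRA); not the paper's implementation.
# All dimensions, ranks, and names below are hypothetical.

def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full_params, lora_params) for a single d_out x d_in weight matrix."""
    full = d_out * d_in                # parameters updated by full fine-tuning
    lora = rank * (d_in + d_out)       # parameters in A (rank x d_in) and B (d_out x rank)
    return full, lora


def matvec(matrix: list[list[float]], vector: list[float]) -> list[float]:
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int) -> list[float]:
    """Compute y = W x + (alpha / rank) * B (A x); W stays frozen, A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    full, lora = lora_param_counts(d_out=2048, d_in=2048, rank=8)
    print(f"full: {full}, lora: {lora}, fraction trained: {lora / full:.4%}")
```

Under these assumed sizes, the adapter trains well under one percent of the parameters in that matrix, which is the kind of saving that makes fine-tuning on a few hundred to one thousand samples per task practical on modest hardware.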
