CL AI Nov 14, 2025

Scaling Open-Weight Large Language Models for Hydropower Regulatory Information Extraction: A Systematic Analysis

Hong-Jun Yoon, Faisal Ashraf, Thomas A. Ruggles, Debjani Singh

arXiv:2511.11821v1 2.7 h-index: 1

Originality: Incremental advance

AI Analysis

The study provides immediate value for hydropower compliance by enabling evidence-based model selection, and offers insights into parameter scaling that may generalize to other information extraction tasks.

The study evaluates open-weight large language models for extracting information from hydropower regulatory documents. It finds that 14B parameters is a critical threshold for viable performance (F1 = 0.64): smaller models plateau at F1 = 0.51, while large-scale models reach up to F1 = 0.77.

Information extraction from regulatory documents using large language models presents critical trade-offs between performance and computational resources. We evaluated seven open-weight models (0.6B-70B parameters) on hydropower licensing documentation to provide empirical deployment guidance. Our analysis identified a pronounced 14B parameter threshold where validation methods transition from ineffective (F1 < 0.15) to viable (F1 = 0.64). Consumer-deployable models achieve 64% F1 through appropriate validation, while smaller models plateau at 51%. Large-scale models approach 77% F1 but require enterprise infrastructure. We identified systematic hallucination patterns where perfect recall indicates extraction failure rather than success in smaller models. Our findings establish the first comprehensive resource-performance mapping for open-weight information extraction in regulatory contexts, enabling evidence-based model selection. These results provide immediate value for hydropower compliance while contributing insights into parameter scaling effects that generalize across information extraction tasks.
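The remark that perfect recall can indicate failure follows from how precision, recall, and F1 interact. The sketch below is a minimal illustration, not the authors' evaluation code: the field names, values, and the two "model outputs" are hypothetical and chosen only to show the arithmetic. A model that over-generates recovers every gold item (recall 1.0) while most of its output is hallucinated, so precision and F1 stay low.

```python
def precision_recall_f1(predicted: set, gold: set) -> tuple[float, float, float]:
    """Compute set-based precision, recall, and F1 for extracted items."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical gold annotations for one document: (field, value) pairs.
gold = {("capacity_mw", "12.5"), ("river", "Example River")}

# Hypothetical over-generating model: both gold items plus 8 hallucinated ones.
hallucinated = {(f"made_up_field_{i}", "n/a") for i in range(8)}
over_generating = gold | hallucinated

# Hypothetical conservative model: one correct item, nothing hallucinated.
conservative = {("capacity_mw", "12.5")}

p_over, r_over, f1_over = precision_recall_f1(over_generating, gold)
p_cons, r_cons, f1_cons = precision_recall_f1(conservative, gold)

print(f"over-generating: P={p_over:.2f} R={r_over:.2f} F1={f1_over:.2f}")
print(f"conservative:    P={p_cons:.2f} R={r_cons:.2f} F1={f1_cons:.2f}")
```

In this toy case the over-generating output has recall 1.0 but precision 0.2, so its F1 (about 0.33) is lower than that of the conservative output (about 0.67), which is why recall alone is a misleading success signal.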

