LG | Jul 17, 2025

GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM

arXiv:2507.13323v1 | 1 citation | h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses socio-economic estimation for policy-making in developing countries with limited data. It represents an incremental improvement achieved through hybrid methods.

The paper tackles the problem of estimating socio-economic indicators like GDP and population in data-scarce regions. It introduces GeoReg, a regression model that integrates satellite imagery and geospatial data and uses LLM prior knowledge for feature extraction. In experiments across three countries, including low-income ones, it outperforms baselines.

Socio-economic indicators like regional GDP, population, and education levels are crucial to shaping policy decisions and fostering sustainable development. This research introduces GeoReg, a regression model that integrates diverse data sources, including satellite imagery and web-based geospatial information, to estimate these indicators even for data-scarce regions such as developing countries. Our approach leverages the prior knowledge of a large language model (LLM) to address the scarcity of labeled data, with the LLM functioning as a data engineer by extracting informative features to enable effective estimation in few-shot settings. Specifically, our model obtains contextual relationships between data features and the target indicator, categorizing their correlations as positive, negative, mixed, or irrelevant. These features are then fed into a linear estimator with tailored weight constraints for each category. To capture nonlinear patterns, the model also identifies meaningful feature interactions and integrates them, along with nonlinear transformations. Experiments across three countries at different stages of development demonstrate that our model outperforms baselines in estimating socio-economic indicators, even for low-income countries with limited data availability.
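
The weight-constrained linear estimator described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "positive" features get non-negative weights, "negative" features get non-positive weights, "mixed" features are unconstrained, and "irrelevant" features are dropped. The function name `fit_constrained`, the `BOUNDS` mapping, and the synthetic data are all hypothetical, and the category labels are hard-coded here rather than produced by an LLM.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Assumed mapping from correlation category to weight bounds.
# The abstract names the categories; these bounds are an illustration.
BOUNDS = {
    "positive": (0.0, np.inf),
    "negative": (-np.inf, 0.0),
    "mixed": (-np.inf, np.inf),
}


def fit_constrained(X, y, categories):
    """Least-squares fit with per-feature sign constraints.

    Features labeled "irrelevant" are excluded, so their weight stays 0.
    Returns (weights, intercept); weights align with the columns of X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = [j for j, c in enumerate(categories) if c != "irrelevant"]
    lower = [BOUNDS[categories[j]][0] for j in keep]
    upper = [BOUNDS[categories[j]][1] for j in keep]
    # Append a column of ones so the intercept is fitted without constraint.
    A = np.hstack([X[:, keep], np.ones((X.shape[0], 1))])
    result = lsq_linear(A, y, bounds=(lower + [-np.inf], upper + [np.inf]))
    weights = np.zeros(X.shape[1])
    weights[keep] = result.x[:-1]
    return weights, result.x[-1]


# Synthetic few-shot example: 20 labeled regions, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2] + 0.1 * rng.normal(size=20)
cats = ["positive", "negative", "mixed", "irrelevant"]  # stand-in for LLM labels
w, b = fit_constrained(X, y, cats)
print(w, b)
```

Under the same assumptions, feature interactions or nonlinear transformations could be handled by appending extra columns to `X` (for example, a product of two features) and assigning each new column its own category.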