LG Mar 10

Proxy-Guided Measurement Calibration

arXiv:2603.09288v253.11 citations, h-index: 5
AI Analysis

This paper addresses measurement error in survey and administrative data, a concern for researchers and policymakers, though it is an incremental improvement over existing methods.

The paper tackles systematic measurement error in aggregate outcome variables, such as the losses recorded in disaster loss databases, by proposing a proxy-guided framework to estimate and correct these errors. It reports improved calibration in synthetic and real-world evaluations.

Aggregate outcome variables collected through surveys and administrative records are often subject to systematic measurement error. For instance, in disaster loss databases, reported county-level losses may differ from the true damages due to variations in on-the-ground data collection capacity, reporting practices, and event characteristics. Such miscalibration complicates downstream analysis and decision-making. We study the problem of outcome miscalibration and propose a framework guided by proxy variables for estimating and correcting the systematic errors. We model the data-generating process using a causal graph that separates the latent content variables driving the true outcome from the latent bias variables that induce systematic errors. The key insight is that proxy variables that depend on the true outcome but are independent of the bias mechanism provide identifying information for quantifying the bias. Leveraging this structure, we introduce a two-stage approach that uses variational autoencoders to disentangle content and bias latents, enabling us to estimate the effect of bias on the outcome of interest. We analyze the assumptions underlying our approach and evaluate it on synthetic data, semi-synthetic datasets derived from randomized trials, and a real-world case study of disaster loss reporting.
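The identifying role of the proxy can be illustrated with a small linear simulation. This is a toy sketch, not the paper's two-stage variational autoencoder method: it assumes the bias variable is observed rather than latent, that all relationships are linear, and that the proxy is measured on the same scale as the true outcome. Every coefficient and name below (`simulate`, `ols_slope`, `BETA`) is hypothetical. Because the bias variable is correlated with the content that drives the true outcome, regressing the reported outcome on it directly confounds the two; subtracting the proxy removes the true outcome and leaves only the bias effect.

```python
import random


def ols_slope(x, y):
    """Slope b of the least-squares line y ~ a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


def simulate(n=20000, beta=-0.5, seed=0):
    """Generate toy data; all coefficients are illustrative assumptions."""
    rng = random.Random(seed)
    bias, reported, proxy = [], [], []
    for _ in range(n):
        content = rng.gauss(0, 1)                      # drives the true outcome
        b = 0.8 * content + rng.gauss(0, 1)            # bias variable, correlated with content
        y_true = 2.0 * content + rng.gauss(0, 0.5)     # unobserved true outcome
        y_obs = y_true + beta * b + rng.gauss(0, 0.5)  # reported outcome, shifted by bias
        p = y_true + rng.gauss(0, 0.5)                 # proxy: depends on y_true, not on b
        bias.append(b)
        reported.append(y_obs)
        proxy.append(p)
    return bias, reported, proxy


BETA = -0.5  # assumed true effect of the bias variable on the reported outcome
bias, reported, proxy = simulate(beta=BETA)

# Naive: regress the reported outcome on the bias variable directly.
naive = ols_slope(bias, reported)

# Proxy-guided: the gap (reported - proxy) cancels the true outcome,
# so its slope on the bias variable isolates the bias effect.
gap = [y - p for y, p in zip(reported, proxy)]
corrected = ols_slope(bias, gap)

print(f"true bias effect:     {BETA}")
print(f"naive estimate:       {naive:.3f}")
print(f"proxy-based estimate: {corrected:.3f}")
```

In this setup the naive slope absorbs the correlation between the bias variable and the true outcome, while the proxy-based slope should land close to `BETA`. If the proxy were itself affected by the bias mechanism, the subtraction would no longer cancel cleanly, which is why the independence assumption matters.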

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
