Active Measurement: Efficient Estimation at Scale
The work addresses the need for efficient, reliable data analysis in scientific discovery, though it appears incremental because it builds on existing human-in-the-loop and sampling methods.
The paper tackles the problem of obtaining accurate, statistically guaranteed measurements in AI-driven scientific workflows. It introduces active measurement, a human-in-the-loop framework that uses AI predictions to drive importance sampling of units for human labeling, and reports reduced estimation error compared to alternatives on several measurement tasks.
AI has the potential to transform scientific discovery by analyzing vast datasets with little human effort. However, current workflows often do not provide the accuracy or statistical guarantees that are needed. We introduce active measurement, a human-in-the-loop AI framework for scientific measurement. An AI model is used to predict measurements for individual units, which are then sampled for human labeling using importance sampling. With each new set of human labels, the AI model is improved and an unbiased Monte Carlo estimate of the total measurement is refined. Active measurement can provide precise estimates even with an imperfect AI model, and requires little human effort when the AI model is very accurate. We derive novel estimators, weighting schemes, and confidence intervals, and show that active measurement reduces estimation error compared to alternatives in several measurement tasks.
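The core sampling idea in the abstract can be illustrated with a minimal sketch of a textbook importance-sampling estimator of a total. This is not the paper's estimator: it covers a single round with a fixed AI model, omits the model-update loop and the paper's weighting schemes, and uses a plain normal-approximation standard error rather than the confidence intervals the authors derive. The names `estimate_total`, `predictions`, and `label_fn` are hypothetical, and the sketch assumes every prediction is strictly positive so that every unit can be sampled.

```python
import random


def estimate_total(predictions, label_fn, n_samples, rng):
    """Importance-sampling estimate of the sum of true measurements.

    predictions: AI-predicted measurement per unit (assumed strictly positive).
    label_fn:    hypothetical stand-in for a human annotator; maps a unit
                 index to its true measurement.
    n_samples:   number of units sent for labeling (at least 2).
    rng:         a random.Random instance, for reproducibility.
    Returns (estimate, standard_error).
    """
    if n_samples < 2:
        raise ValueError("need at least 2 samples to estimate a standard error")
    if any(p <= 0 for p in predictions):
        raise ValueError("all predictions must be strictly positive")

    # Proposal distribution: sample units in proportion to the AI prediction.
    total_pred = sum(predictions)
    probs = [p / total_pred for p in predictions]

    # Draw units with replacement and collect the reweighted human labels.
    indices = rng.choices(range(len(predictions)), weights=probs, k=n_samples)
    terms = [label_fn(i) / probs[i] for i in indices]

    # The sample mean of label / probability is unbiased for the true total.
    estimate = sum(terms) / n_samples
    variance = sum((t - estimate) ** 2 for t in terms) / (n_samples - 1)
    standard_error = (variance / n_samples) ** 0.5
    return estimate, standard_error
```

When the predictions match the true measurements exactly, every reweighted term equals the true total, so the estimate is exact after very few labels. When the predictions are imperfect, the estimate remains unbiased but has higher variance. This mirrors the abstract's claim that the approach stays precise with an imperfect model and needs little human effort with an accurate one.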