ML, LG, ME | Jun 1, 2024

Combining Experimental and Historical Data for Policy Evaluation

arXiv:2406.00317v1 | 6 citations
Originality: Incremental advance
AI Analysis

This work addresses policy evaluation in settings with multiple data sources, such as ridesharing. It is an incremental advance because it builds on existing data integration methods.

The paper tackles policy evaluation by integrating experimental and historical data. It proposes methods that linearly combine base estimators, with weights optimized to minimize the mean square error, and reports superior performance in numerical experiments and on real-world ridesharing data.

This paper studies policy evaluation with multiple data sources, especially in scenarios that involve one experimental dataset with two arms, complemented by a historical dataset generated under a single control arm. We propose novel data integration methods that linearly integrate base policy value estimators constructed based on the experimental and historical data, with weights optimized to minimize the mean square error (MSE) of the resulting combined estimator. We further apply the pessimistic principle to obtain more robust estimators, and extend these developments to sequential decision making. Theoretically, we establish non-asymptotic error bounds for the MSEs of our proposed estimators, and derive their oracle, efficiency and robustness properties across a broad spectrum of reward shift scenarios. Numerical experiments and real-data-based analyses from a ridesharing company demonstrate the superior performance of the proposed estimators.
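
The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator. It assumes only two base estimators of the same control-arm value: one from the experimental data, taken to be unbiased, and one from the historical data, which may be biased by a reward shift. It also assumes the two are independent and that their variances and the bias are known. In practice these quantities must be estimated. All function and parameter names below are hypothetical.

```python
def combined_mse(w, var_exp, var_hist, bias_hist):
    """MSE of w * hist + (1 - w) * exp.

    Assumes the experimental estimator is unbiased, the historical one has
    bias `bias_hist`, and the two estimators are independent.
    """
    return (1.0 - w) ** 2 * var_exp + w ** 2 * (var_hist + bias_hist ** 2)


def mse_optimal_weight(var_exp, var_hist, bias_hist):
    """Weight on the historical estimator that minimizes combined_mse.

    Setting the derivative of the quadratic in w to zero gives
    w* = var_exp / (var_exp + var_hist + bias_hist^2).
    """
    return var_exp / (var_exp + var_hist + bias_hist ** 2)


def combine(est_exp, est_hist, w):
    """Linear combination of the two base estimates."""
    return (1.0 - w) * est_exp + w * est_hist


# Illustrative numbers only (not from the paper).
w_no_shift = mse_optimal_weight(var_exp=1.0, var_hist=1.0, bias_hist=0.0)
w_big_shift = mse_optimal_weight(var_exp=1.0, var_hist=1.0, bias_hist=10.0)
```

With no reward shift and equal variances, the weight splits evenly between the two sources. As the assumed shift grows, the weight on the historical estimator shrinks toward zero, and the combined estimator falls back on the experimental data alone.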

Code Implementations: 1 repo