ML LG Apr 3

Transfer Learning for Meta-analysis Under Covariate Shift

arXiv:2604.02656

AI Analysis

This work addresses the challenge of making decisions from randomized controlled trials that do not represent the target population, a problem that matters for healthcare and policy applications. The contribution is incremental because it builds on existing transport and meta-analysis methods.

The paper tackles covariate shift in meta-analysis with a placebo-anchored transport framework: source-trial outcomes serve as proxy signals, and target-trial placebo outcomes calibrate baseline risk. The result is more accurate estimation of heterogeneous treatment effects, especially at small target sample sizes. In experiments, the method is best or near-best across connected settings (targets that include a treated arm).

Randomized controlled trials often do not represent the populations where decisions are made, and covariate shift across studies can invalidate standard individual participant data (IPD) meta-analysis and transport estimators. We propose a placebo-anchored transport framework that treats source-trial outcomes as abundant proxy signals and target-trial placebo outcomes as scarce, high-fidelity gold labels to calibrate baseline risk. A low-complexity (sparse) correction anchors proxy outcome models to the target population, and the anchored models are embedded in a cross-fitted doubly robust learner, yielding a Neyman-orthogonal, target-site doubly robust estimator for patient-level heterogeneous treatment effects when target treated outcomes are available. We distinguish two regimes: in connected targets (with a treated arm), the method yields target-identified effect estimates; in disconnected targets (placebo-only), it reduces to a principled screen-then-transport procedure under explicit working-model transport assumptions. Experiments on synthetic data and a semi-synthetic IHDP benchmark evaluate pointwise conditional average treatment effect (CATE) accuracy, average treatment effect (ATE) error, ranking quality for targeting, decision-theoretic policy regret, and calibration. Across connected settings, the proposed method is best or near-best and improves substantially over proxy-only, target-only, and transport baselines at small target sample sizes; in disconnected settings, it retains strong ranking performance for targeting while pointwise accuracy depends on the strength of the working transport condition.
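The connected-target pipeline in the abstract (proxy outcome model, sparse anchoring correction, cross-fitted doubly robust learner) can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the proxy models are ordinary least squares fits on source data, the sparse correction is a lasso on target residuals solved by proximal gradient (ISTA), both arms are anchored the same way, the target randomization probability `e` is known, and the final CATE model is linear. All function names (`lasso_ista`, `fit_anchored`, `dr_cate`, `simulate`) are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    # Proximal operator of the L1 penalty.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    # Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by proximal gradient.
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b

def add_intercept(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_anchored(X_src, y_src, X_tgt, y_tgt, lam):
    # Proxy model from abundant source data (assumed linear here).
    beta_proxy, *_ = np.linalg.lstsq(add_intercept(X_src), y_src, rcond=None)
    # Sparse correction fitted on the scarce target residuals.
    # Note: this simple version also penalises the intercept shift.
    A_tgt = add_intercept(X_tgt)
    delta = lasso_ista(A_tgt, y_tgt - A_tgt @ beta_proxy, lam)
    return beta_proxy + delta

def dr_cate(X_src, a_src, y_src, X_tgt, a_tgt, y_tgt,
            e=0.5, n_folds=2, lam=0.05, seed=0):
    # Cross-fitting: nuisance models for each fold use only the other folds.
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(X_tgt)) % n_folds
    psi = np.empty(len(X_tgt))
    for k in range(n_folds):
        train, test = folds != k, folds == k
        mu = {}
        for arm in (0, 1):
            s = a_src == arm
            t = train & (a_tgt == arm)
            beta = fit_anchored(X_src[s], y_src[s], X_tgt[t], y_tgt[t], lam)
            mu[arm] = add_intercept(X_tgt[test]) @ beta
        a, y = a_tgt[test], y_tgt[test]
        # Doubly robust (AIPW) pseudo-outcome with known propensity e.
        psi[test] = (mu[1] - mu[0]
                     + a * (y - mu[1]) / e
                     - (1 - a) * (y - mu[0]) / (1 - e))
    # Final stage: regress pseudo-outcomes on covariates (linear CATE assumed).
    theta, *_ = np.linalg.lstsq(add_intercept(X_tgt), psi, rcond=None)
    return theta, psi

# Synthetic demo: covariate shift plus a sparse baseline-risk shift in the target.
rng = np.random.default_rng(1)

def simulate(n, cov_shift, base_shift):
    X = rng.normal(cov_shift, 1.0, size=(n, 3))
    a = rng.integers(0, 2, size=n)
    baseline = 1 + X[:, 0] + base_shift * (1 + X[:, 2])
    y = baseline + a * (2 + X[:, 1]) + 0.5 * rng.normal(size=n)
    return X, a, y

X_src, a_src, y_src = simulate(2000, cov_shift=0.0, base_shift=0.0)
X_tgt, a_tgt, y_tgt = simulate(400, cov_shift=0.5, base_shift=0.5)

theta, psi = dr_cate(X_src, a_src, y_src, X_tgt, a_tgt, y_tgt)
ate_hat = psi.mean()
print("estimated target ATE:", ate_hat)
print("CATE coefficients (intercept, x0, x1, x2):", theta)
```

In this simulation the true target effect is 2 + x1 with x1 centred at 0.5, so the target ATE is 2.5 by construction. Because the sketch plugs in the known randomization probability, the averaged pseudo-outcome stays unbiased for the ATE even when the anchored outcome models are misspecified; the anchoring step mainly reduces variance. The disconnected (placebo-only) regime is not covered here, since it depends on transport assumptions the abstract does not spell out.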
