Gabriel Levin-Konigsberg


2 Papers

46.7 · AI · May 20
Mind the Sim-to-Real Gap & Think Like a Scientist

Harsh Parikh, Gabriel Levin-Konigsberg, Dominique Perrault-Joncas et al.

Suppose a planner has a pre-trained simulator of a sequential decision problem and the option to run real experiments in the field. The simulator is cheap to query but inherits confounding and drift from its calibration data. Experimentation is unbiased but consumes one real unit per trial. We study when, and how, the planner should supplement the simulator with experiments. We give three results. First, an extended simulation lemma decomposes the simulator's value error into a calibration-deployment shift that randomization can identify and a parametric residual that no further interaction can reduce. Second, the value gap between the simulator-optimal policy and the optimum splits into a local component, on states the deployed policy already visits, and a reachability component, on states it does not. The reachability component stays bounded away from zero at any horizon under purely passive learning. Third, we propose Fisher-SEP, a simulation-aided experimental policy (SEP) that minimizes the posterior predictive variance of a target policy's value, with reward-only and transition-only specializations. Two case studies illustrate the regimes. In a vending-machine supply chain, front-loaded experimentation overtakes posterior updating once the horizon is long enough to amortize the pilot. In an HIV mobile-testing example with a corridor that separates a well-surveilled region from a poorly-surveilled one, only designed exploration reaches the poorly-surveilled region.
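For orientation, the sketch below illustrates the classic simulation lemma for a fixed policy, not the extended lemma the abstract describes. It bounds the gap between a policy's value in the real environment and in a simulator by the reward and transition mismatch. The two-state chains, the function names, and the assumption that rewards lie in [0, r_max] are all hypothetical choices made for illustration.

```python
def evaluate_policy(P, R, gamma, tol=1e-12, max_iter=100_000):
    """Iterative policy evaluation for a policy-induced Markov reward process.

    P[s][t] is the transition probability from s to t under the fixed policy,
    R[s] is the expected one-step reward in state s.
    """
    n = len(R)
    V = [0.0] * n
    for _ in range(max_iter):
        V_new = [R[s] + gamma * sum(P[s][t] * V[t] for t in range(n)) for s in range(n)]
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


def simulation_lemma_bound(P_real, R_real, P_sim, R_sim, gamma, r_max):
    """Classic bound: ||V_real - V_sim||_inf <= (eps_r + gamma * eps_p * v_max) / (1 - gamma).

    eps_r is the sup-norm reward error, eps_p the worst-case L1 distance between
    transition rows, and v_max = r_max / (1 - gamma) (rewards assumed in [0, r_max]).
    """
    n = len(R_real)
    eps_r = max(abs(R_real[s] - R_sim[s]) for s in range(n))
    eps_p = max(sum(abs(P_real[s][t] - P_sim[s][t]) for t in range(n)) for s in range(n))
    v_max = r_max / (1.0 - gamma)
    return (eps_r + gamma * eps_p * v_max) / (1.0 - gamma)


# Hypothetical two-state example: the simulator misstates one transition row
# and one reward relative to the real environment.
P_real = [[0.9, 0.1], [0.2, 0.8]]
R_real = [1.0, 0.0]
P_sim = [[0.8, 0.2], [0.2, 0.8]]
R_sim = [0.9, 0.0]
gamma = 0.9

V_real = evaluate_policy(P_real, R_real, gamma)
V_sim = evaluate_policy(P_sim, R_sim, gamma)
gap = max(abs(a - b) for a, b in zip(V_real, V_sim))
bound = simulation_lemma_bound(P_real, R_real, P_sim, R_sim, gamma, r_max=1.0)
print(f"value gap = {gap:.4f}, simulation-lemma bound = {bound:.4f}")
```

The bound is loose by design. It only shows that value error is controlled by model error, which is the quantity the abstract's decomposition then splits into an identifiable shift and an irreducible residual.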

ME · Mar 7
TEA-Time: Transporting Effects Across Time

Harsh Parikh, Gabriel Levin-Konigsberg, Dominique Perrault-Joncas et al.

Treatment effects estimated from randomized controlled trials are local not only to the study population but also to the time at which the trial was conducted. We develop a framework for temporal transportation: extrapolating treatment effects to time periods where no experiment was conducted. We target the transported average treatment effect (TATE) and show that under a separable temporal effects assumption, the TATE decomposes into an observed average treatment effect and a temporal ratio. We provide two identification strategies, one using replicated trials that compare the same treatments at different times and another using common treatment arms observed across time, and develop doubly robust, semiparametrically efficient estimators for each. Monte Carlo simulations confirm that both estimators achieve nominal coverage, with the common arm strategy yielding substantial efficiency gains when its stronger assumptions hold. We apply our methods to A/B tests from the Upworthy Research Archive, demonstrating that the two strategies exhibit a variance-bias tradeoff: the common arm approach offers greater precision but may incur bias when treatments interact heterogeneously with temporal factors.
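The decomposition into an observed effect and a temporal ratio can be made concrete with a naive plug-in sketch of the replicated-trial idea. It assumes a multiplicative form of separability, which is an assumption of this sketch and may differ from the paper's exact condition. It uses plain difference-in-means rather than the doubly robust estimators the abstract mentions. All function names and numbers are hypothetical.

```python
def difference_in_means(treated, control):
    """Unadjusted ATE estimate from a randomized trial."""
    return sum(treated) / len(treated) - sum(control) / len(control)


def transported_ate(ate_target_t0, ate_ref_t0, ate_ref_t1):
    """Plug-in TATE under an assumed multiplicative temporal effect.

    ate_target_t0: effect of the comparison of interest, observed only at time t0.
    ate_ref_t0, ate_ref_t1: effect of a reference comparison replicated at t0 and t1.
    The temporal ratio ate_ref_t1 / ate_ref_t0 rescales the observed effect to t1.
    """
    if ate_ref_t0 == 0:
        raise ValueError("reference effect at t0 is zero; temporal ratio is undefined")
    temporal_ratio = ate_ref_t1 / ate_ref_t0
    return ate_target_t0 * temporal_ratio


# Hypothetical click-through rates per unit.
ate_target_t0 = difference_in_means([0.30, 0.34, 0.32], [0.20, 0.22, 0.24])
ate_ref_t0 = difference_in_means([0.15, 0.15], [0.10, 0.10])
ate_ref_t1 = difference_in_means([0.25, 0.25], [0.15, 0.15])

tate = transported_ate(ate_target_t0, ate_ref_t0, ate_ref_t1)
print(f"ATE at t0 = {ate_target_t0:.3f}, transported to t1 = {tate:.3f}")
```

A ratio of two noisy estimates is unstable when the reference effect at t0 is near zero. This is one practical reason to prefer estimators with formal efficiency guarantees over this plug-in form.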