AI | Jun 5, 2022
Sequential Counterfactual Decision-Making Under Confounded Reward
arXiv:2206.02216v1
Originality: Synthesis-oriented
AI Analysis
This paper addresses a methodological challenge at the intersection of reinforcement learning and causal inference, but it appears incremental because it builds on existing counterfactual frameworks.
The paper tackles sequential decision-making under confounded rewards by formalizing a counterfactual policy space in which the agent's natural predilection is the input to a soft intervention.
We investigate the limitations of random trials when the cause of interest is confounded with the effect by formalizing a counterfactual policy-space where the agent's natural predilection is input to a soft-intervention.
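To make the limitation of random trials concrete, here is a minimal sketch of a single-step, two-action problem with an unobserved binary confounder. It is not the paper's model or algorithm. The names `P_U`, `PAYOUT`, `intent`, and `expected_reward`, and all the numbers, are hypothetical and chosen only for illustration. The sketch assumes the agent's natural predilection (its intent) equals the confounder, and it models a soft intervention as a map from intent to action.

```python
from itertools import product

# Hypothetical toy model, not taken from the paper.
# U is an unobserved binary confounder with a uniform prior.
P_U = {0: 0.5, 1: 0.5}

# PAYOUT[(u, x)] = P(Y = 1 | U = u, X = x); illustrative numbers only.
PAYOUT = {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.7, (1, 1): 0.3}


def intent(u):
    """Natural predilection: assumed here to be fully determined by U."""
    return u


def expected_reward(policy):
    """Exact expected reward of a policy that maps intent to an action."""
    return sum(p * PAYOUT[(u, policy(intent(u)))] for u, p in P_U.items())


# Observational regime: the agent simply acts on its predilection.
observational = expected_reward(lambda i: i)

# Random trial: the action ignores intent, so only the average payout of
# each arm is visible; take the better of the two arms.
best_random_trial = max(expected_reward(lambda i, x=x: x) for x in (0, 1))

# Counterfactual policy space: every map from intent to action.
best_counterfactual = max(
    expected_reward(lambda i, m=m: m[i]) for m in product((0, 1), repeat=2)
)

print(round(observational, 2), round(best_random_trial, 2),
      round(best_counterfactual, 2))
```

Under these assumed numbers, acting on intent earns 0.25 in expectation, the best arm found by a random trial earns 0.45, and the best intent-conditioned policy earns 0.65. The gap arises because intent carries information about the confounder that randomization discards.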