LG AP ML · Jun 3, 2020

Learning Robust Decision Policies from Observational Data

Muhammad Osama, Dave Zachariah, Peter Stoica

arXiv:2006.02355v13.36 citations

Originality: Incremental advance

AI Analysis

This paper addresses the need for reliable policy learning in high-stakes domains, though it appears incremental because it builds on existing conformal prediction techniques.

The paper tackles the problem of learning robust decision policies from observational data, particularly in safety-critical applications like medical decision support. It develops a method that reduces tails of the cost distribution and provides statistically valid bounds on decision costs, with performance validated on real and synthetic data.

We address the problem of learning a decision policy from observational data of past decisions in contexts with features and associated outcomes. The past policy may be unknown, and in safety-critical applications, such as medical decision support, it is of interest to learn robust policies that reduce the risk of outcomes with high costs. In this paper, we develop a method for learning policies that reduce tails of the cost distribution at a specified level and, moreover, provide a statistically valid bound on the cost of each decision. These properties are valid under finite samples, even in scenarios with uneven or no overlap between features for different decisions in the observed data, by building on recent results in conformal prediction. The performance and statistical properties of the proposed method are illustrated using both real and synthetic data.
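
To make the idea of a finite-sample cost bound concrete, here is a minimal sketch of a generic one-sided split-conformal bound. It is not the authors' algorithm: it ignores the context features entirely, assumes the costs observed under each decision are exchangeable with a future cost under that decision, and the function names `conformal_upper_bound` and `robust_decision` are hypothetical. Under those assumptions, a new cost falls at or below the returned bound with probability at least 1 - alpha.

```python
import numpy as np

def conformal_upper_bound(cal_costs, alpha):
    """One-sided split-conformal bound on a future cost (illustrative only).

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration cost,
    or infinity when n is too small to give a finite bound at this level.
    """
    costs = np.sort(np.asarray(cal_costs, dtype=float))
    n = len(costs)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return float("inf")
    return float(costs[k - 1])

def robust_decision(costs_by_decision, alpha):
    """Pick the decision whose conformal cost bound is smallest.

    costs_by_decision maps each decision label to the costs observed
    under that decision (a hypothetical, feature-free data layout).
    """
    bounds = {d: conformal_upper_bound(c, alpha)
              for d, c in costs_by_decision.items()}
    best = min(bounds, key=bounds.get)
    return best, bounds
```

Choosing the decision with the smallest bound targets the tail of the cost distribution rather than its mean, which is the sense of "robust" in the abstract. Handling features and uneven overlap between decisions would need more machinery than this sketch contains.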
