LG AI EM · Dec 13, 2022

Policy learning for many outcomes of interest: Combining optimal policy trees with multi-objective Bayesian optimisation

arXiv:2212.06312v2 · 2 citations · h-index: 3

Originality: Incremental advance

AI Analysis

The paper addresses a realistic challenge in policy-making, where multiple outcomes must be balanced. The advance is incremental, since it builds on existing methods such as optimal trees and Bayesian optimization.

The paper tackles the problem of learning optimal policies when decision-makers need to trade off multiple outcomes rather than maximize a single one. It proposes Multi-Objective Policy Learning (MOPoL), which combines optimal decision trees with multi-objective Bayesian optimization to explore those trade-offs. It applies the method to a case study on anti-malarial medication rationing in Kenya, showing that a low-cost greedy tree can accurately proxy a costly optimal tree for decision-making.

Methods for learning optimal policies use causal machine learning models to create human-interpretable rules for making choices around the allocation of different policy interventions. However, in realistic policy-making contexts, decision-makers often care about trade-offs between outcomes, not just single-mindedly maximising utility for one outcome. This paper proposes an approach termed Multi-Objective Policy Learning (MOPoL), which combines optimal decision trees for policy learning with a multi-objective Bayesian optimisation approach to explore the trade-off between multiple outcomes. It does this by building a Pareto frontier of non-dominated models for different hyperparameter settings that govern outcome weighting. The key insight is that, for the purposes of making decisions, a low-cost greedy tree can be an accurate proxy for the computationally costly optimal tree, which means models can be fit repeatedly to learn a Pareto frontier. The method is applied to a real-world case study of non-price rationing of anti-malarial medication in Kenya.
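To make the idea of a Pareto frontier over outcome weightings concrete, here is a minimal sketch in Python. It is not the paper's implementation, and several things are assumptions made purely for illustration: the data are synthetic, the policy is a depth-1 greedy tree on a single hypothetical covariate `x`, and a plain grid sweep over the weight stands in for multi-objective Bayesian optimisation. The function names (`pareto_mask`, `fit_greedy_stump`, `apply_rule`, `sweep_weights`) are hypothetical.

```python
import numpy as np

def pareto_mask(values):
    """Boolean mask of non-dominated rows, maximising every column."""
    values = np.asarray(values, dtype=float)
    mask = np.ones(len(values), dtype=bool)
    for i, v in enumerate(values):
        # Row i is dominated if another row is >= everywhere and > somewhere.
        dominated_by = np.all(values >= v, axis=1) & np.any(values > v, axis=1)
        mask[i] = not dominated_by.any()
    return mask

def fit_greedy_stump(x, gain):
    """Depth-1 greedy policy: treat the units on one side of a threshold on x.

    gain[i] is the weighted benefit of treating unit i.
    """
    best_value, best_rule = 0.0, (np.inf, "above")  # default rule treats nobody
    for t in np.unique(x):
        for side in ("above", "below"):
            treat = x >= t if side == "above" else x < t
            value = gain[treat].sum()
            if value > best_value:
                best_value, best_rule = value, (t, side)
    return best_rule

def apply_rule(x, rule):
    """Return the boolean treatment assignment implied by a fitted rule."""
    t, side = rule
    return x >= t if side == "above" else x < t

def sweep_weights(n_units=500, n_weights=11, seed=0):
    """Fit one greedy stump per outcome weight and flag the non-dominated fits."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n_units)  # hypothetical covariate
    # Synthetic per-unit treatment effects on two outcomes that pull in
    # opposite directions along x (an assumption, not data from the paper).
    tau_a = x - 0.3 + rng.normal(0.0, 0.1, n_units)
    tau_b = 0.6 - x + rng.normal(0.0, 0.1, n_units)
    weights = np.linspace(0.0, 1.0, n_weights)  # grid sweep, not Bayesian optimisation
    outcomes = []
    for w in weights:
        rule = fit_greedy_stump(x, w * tau_a + (1.0 - w) * tau_b)
        treat = apply_rule(x, rule)
        # Average gain per unit on each outcome under the fitted policy.
        outcomes.append((tau_a[treat].sum() / n_units, tau_b[treat].sum() / n_units))
    outcomes = np.array(outcomes)
    return weights, outcomes, pareto_mask(outcomes)

if __name__ == "__main__":
    weights, outcomes, on_front = sweep_weights()
    for w, (a, b) in zip(weights[on_front], outcomes[on_front]):
        print(f"weight on A = {w:.1f}: gain A = {a:.3f}, gain B = {b:.3f}")
```

Because each greedy fit is cheap, the sweep can refit the policy for every weight setting; this is the property the abstract relies on when it uses the greedy tree as a proxy for the optimal tree. In a fuller setup, the grid would presumably be replaced by an optimiser that proposes weight settings adaptively.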
