Patrick Rehill

LG
5papers
18citations
Novelty54%
AI Score27

5 Papers

LGOct 20, 2023
Transparency challenges in policy evaluation with causal machine learning -- improving usability and accountability

Patrick Rehill, Nicholas Biddle

Causal machine learning tools are beginning to see use in real-world policy evaluation tasks to flexibly estimate treatment effects. One issue with these methods is that the machine learning models used are generally black boxes, i.e., there is no globally interpretable way to understand how a model makes estimates. This is a clear problem in policy evaluation applications, particularly in government, because it is difficult to understand whether such models are functioning in ways that are fair, based on the correct interpretation of evidence and transparent enough to allow for accountability if things go wrong. However, there has been little discussion of transparency problems in the causal machine learning literature and how these might be overcome. This paper explores why transparency issues are a problem for causal machine learning in public policy evaluation applications and considers ways these problems might be addressed through explainable AI tools and by simplifying models in line with interpretable AI principles. It then applies these ideas to a case-study using a causal forest model to estimate conditional average treatment effects for a hypothetical change in the school leaving age in Australia. It shows that existing tools for understanding black-box predictive models are poorly suited to causal machine learning and that simplifying the model to make it interpretable leads to an unacceptable increase in error (in this application). It concludes that new tools are needed to properly understand causal machine learning models and the algorithms that fit them.

EMAug 2, 2024
Distilling interpretable causal trees from causal forests

Patrick Rehill

Machine learning methods for estimating treatment effect heterogeneity promise greater flexibility than existing methods that test a few pre-specified hypotheses. However, one problem these methods can have is that it can be challenging to extract insights from complicated machine learning models. A high-dimensional distribution of conditional average treatment effects may give accurate, individual-level estimates, but it can be hard to understand the underlying patterns; hard to know what the implications of the analysis are. This paper proposes the Distilled Causal Tree, a method for distilling a single, interpretable causal tree from a causal forest. This compares well to existing methods of extracting a single tree, particularly in noisy data or high-dimensional data where there are many correlated features. Here it even outperforms the base causal forest in most simulations. Its estimates are doubly robust and asymptotically normal just as those of the causal forest are.

LGMar 21, 2023
Counterfactually Fair Regression with Double Machine Learning

Patrick Rehill

Counterfactual fairness is an approach to AI fairness that tries to make decisions based on the outcomes that an individual with some kind of sensitive status would have had without this status. This paper proposes Double Machine Learning (DML) Fairness which analogises this problem of counterfactual fairness in regression problems to that of estimating counterfactual outcomes in causal inference under the Potential Outcomes framework. It uses arbitrary machine learning methods to partial out the effect of sensitive variables on nonsensitive variables and outcomes. Assuming that the effects of the two sets of variables are additively separable, outcomes will be approximately equalised and individual-level outcomes will be counterfactually fair. This paper demonstrates the approach in a simulation study pertaining to discrimination in workplace hiring and an application on real data estimating the GPAs of law school students. It then discusses when it is appropriate to apply such a method to problems of real-world discrimination where constructs are conceptually complex and finally, whether DML Fairness can achieve justice in these settings.

LGDec 13, 2022
Policy learning for many outcomes of interest: Combining optimal policy trees with multi-objective Bayesian optimisation

Patrick Rehill, Nicholas Biddle

Methods for learning optimal policies use causal machine learning models to create human-interpretable rules for making choices around the allocation of different policy interventions. However, in realistic policy-making contexts, decision-makers often care about trade-offs between outcomes, not just single-mindedly maximising utility for one outcome. This paper proposes an approach termed Multi-Objective Policy Learning (MOPoL) which combines optimal decision trees for policy learning with a multi-objective Bayesian optimisation approach to explore the trade-off between multiple outcomes. It does this by building a Pareto frontier of non-dominated models for different hyperparameter settings which govern outcome weighting. The key here is that a low-cost greedy tree can be an accurate proxy for the very computationally costly optimal tree for the purposes of making decisions which means models can be repeatedly fit to learn a Pareto frontier. The method is applied to a real-world case-study of non-price rationing of anti-malarial medication in Kenya.

EMSep 2, 2023
Fairness Implications of Heterogeneous Treatment Effect Estimation with Machine Learning Methods in Policy-making

Patrick Rehill, Nicholas Biddle

Causal machine learning methods which flexibly generate heterogeneous treatment effect estimates could be very useful tools for governments trying to make and implement policy. However, as the critical artificial intelligence literature has shown, governments must be very careful of unintended consequences when using machine learning models. One way to try and protect against unintended bad outcomes is with AI Fairness methods which seek to create machine learning models where sensitive variables like race or gender do not influence outcomes. In this paper we argue that standard AI Fairness approaches developed for predictive machine learning are not suitable for all causal machine learning applications because causal machine learning generally (at least so far) uses modelling to inform a human who is the ultimate decision-maker while AI Fairness approaches assume a model that is making decisions directly. We define these scenarios as indirect and direct decision-making respectively and suggest that policy-making is best seen as a joint decision where the causal machine learning model usually only has indirect power. We lay out a definition of fairness for this scenario - a model that provides the information a decision-maker needs to accurately make a value judgement about just policy outcomes - and argue that the complexity of causal machine learning models can make this difficult to achieve. The solution here is not traditional AI Fairness adjustments, but careful modelling and awareness of some of the decision-making biases that these methods might encourage which we describe.