Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning
This work addresses the need for transparency and accountability in decision-making in domains such as healthcare. Its contribution is incremental, however: it builds on existing policy learning methods, adding interpretability and support for partial observability.
The paper tackles the problem of understanding human decision-making from observed data in settings such as healthcare, where the underlying states and environment dynamics are unknown and live experimentation is not allowed. It proposes a model-based Bayesian method for interpretable policy learning that jointly estimates an agent's belief-update process and belief-action mapping. The authors demonstrate the method's potential on simulated and real-world data for Alzheimer's disease diagnosis.
Understanding human behavior from observed data is critical for transparency and accountability in decision-making. Consider real-world settings such as healthcare, in which modeling a decision-maker's policy is challenging: there is no access to underlying states, no knowledge of environment dynamics, and no allowance for live experimentation. We seek to learn a data-driven representation of decision-making behavior that (1) is transparent by design, (2) accommodates partial observability, and (3) operates completely offline. To satisfy these key criteria, we propose a novel model-based Bayesian method for interpretable policy learning ("Interpole") that jointly estimates an agent's (possibly biased) belief-update process together with their (possibly suboptimal) belief-action mapping. Through experiments on both simulated and real-world data for the problem of Alzheimer's disease diagnosis, we illustrate the potential of our approach as an investigative device for auditing, quantifying, and understanding human decision-making behavior.
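To make the two components named in the abstract concrete, the following is a minimal sketch of a belief-update step and a belief-action mapping in a small discrete setting. It is illustrative only and is not taken from the text above: the Bayes-filter update, the distance-based softmax policy, the toy matrices, and every name (`belief_update`, `action_probs`, `T`, `O`, `means`, `eta`) are assumptions chosen for the example. The sketch also shows only the forward model with fixed parameters; it does not perform the joint Bayesian estimation the abstract describes.

```python
import numpy as np

def belief_update(b, a, z, T, O):
    """One Bayes-filter step (assumed form): b'(s') ∝ O[a, s', z] * sum_s b(s) * T[s, a, s']."""
    predicted = b @ T[:, a, :]        # predicted state distribution after action a
    unnorm = O[a, :, z] * predicted   # weight by likelihood of observation z
    return unnorm / unnorm.sum()

def action_probs(b, means, eta=10.0):
    """Assumed belief-action mapping: softmax over negative squared distance
    between the belief b and a per-action 'prototype' belief in `means`."""
    d2 = ((means - b) ** 2).sum(axis=1)
    logits = -eta * d2
    logits = logits - logits.max()    # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Hypothetical toy problem: 2 states (healthy, diseased), 2 actions (test, diagnose).
# T[s, a, s']: the state does not change under either action.
T = np.stack([np.eye(2), np.eye(2)], axis=1)
# O[a, s', z]: the test is informative, diagnosing yields no information.
O = np.array([[[0.8, 0.2], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
# Prototype beliefs: "test" when uncertain, "diagnose" when confident of disease.
means = np.array([[0.5, 0.5], [0.0, 1.0]])

b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, a=0, z=1, T=T, O=O)  # positive test result shifts belief
probs = action_probs(b1, means)
```

In this toy setup, a positive test result moves the belief toward the diseased state, and the policy then places more probability on diagnosing than on testing again. A mapping of this kind can be read directly from its parameters, since each action is tied to a region of belief space, which is one way a policy could be transparent by design.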