Unbiasing on the Fly: Explanation-Guided Human Oversight of Machine Learning System Decisions
This work addresses fairness concerns in critical domains such as hiring and healthcare. It offers a new way to mitigate discrimination in systems that are already deployed, though the contribution is incremental because it combines existing techniques.
The paper tackles the problem of discriminatory decision-making in deployed ML systems by proposing a framework for real-time monitoring and correction using counterfactual explanations and human oversight, enabling fairer operations in dynamic settings.
The widespread adoption of ML systems across critical domains like hiring, finance, and healthcare raises growing concerns about their potential for discriminatory decision-making based on protected attributes. While efforts to ensure fairness during development are crucial, they do not prevent a deployed ML system from exhibiting discrimination during operation. To address this gap, we propose a novel framework for on-the-fly tracking and correction of discrimination in deployed ML systems. Leveraging counterfactual explanations, the framework continuously monitors the predictions made by an ML system and flags discriminatory outcomes. When a prediction is flagged, post-hoc explanations of the original prediction and its counterfactual alternatives are presented to a human reviewer for real-time intervention. This human-in-the-loop approach empowers reviewers to accept or override the ML system's decision, enabling fair and responsible ML operation in dynamic settings. While further work is needed for validation and refinement, this framework offers a promising avenue for mitigating discrimination and building trust in ML systems deployed in a wide range of domains.
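The monitoring loop described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: a counterfactual is generated simply by swapping the value of one protected attribute, a prediction is flagged whenever any such swap changes the outcome, and the reviewer is modeled as a callback. The names `audit`, `decide`, `Review`, and `biased_model` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, object]
Model = Callable[[Instance], int]


@dataclass
class Review:
    """What a human reviewer would see for one prediction."""
    instance: Instance
    prediction: int
    # (alternative protected value, prediction under that value),
    # kept only when the prediction differs from the original.
    counterfactuals: List[Tuple[object, int]]

    @property
    def flagged(self) -> bool:
        return bool(self.counterfactuals)


def audit(model: Model, instance: Instance,
          protected: str, values: List[object]) -> Review:
    """Assumed flagging rule: swap the protected attribute and compare outcomes."""
    original = model(instance)
    differing = []
    for value in values:
        if value == instance[protected]:
            continue
        alternative = {**instance, protected: value}
        alt_prediction = model(alternative)
        if alt_prediction != original:
            differing.append((value, alt_prediction))
    return Review(instance, original, differing)


def decide(model: Model, instance: Instance, protected: str,
           values: List[object], reviewer: Callable[[Review], int]) -> int:
    """Return the model's decision, deferring to the reviewer when flagged."""
    review = audit(model, instance, protected, values)
    if not review.flagged:
        return review.prediction
    return reviewer(review)  # the reviewer accepts or overrides


# Toy model with a deliberately biased threshold (illustration only).
def biased_model(applicant: Instance) -> int:
    threshold = 70 if applicant["gender"] == "male" else 85
    return int(applicant["score"] >= threshold)


# Toy reviewer policy: override with the first counterfactual outcome.
def override_reviewer(review: Review) -> int:
    return review.counterfactuals[0][1]


applicant = {"score": 75, "gender": "female"}
review = audit(biased_model, applicant, "gender", ["male", "female"])
final = decide(biased_model, applicant, "gender", ["male", "female"],
               override_reviewer)
```

In this toy run the applicant is rejected by the model but would be accepted if the protected attribute were different, so the prediction is flagged and the reviewer callback determines the final decision. A real deployment would need a richer counterfactual generator and an actual review interface in place of the callback.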