ESCAPE: Countering Systematic Errors from Machine's Blind Spots via Interactive Visual Analysis
This work addresses biased AI predictions by giving practitioners an interactive tool to identify and reduce systematic errors. The contribution is incremental, as it builds on existing human-in-the-loop and debiasing approaches.
The authors tackle systematic errors in classification models, known as AI blindspots, by developing ESCAPE, a visual analytic system that enables human-in-the-loop inspection and mitigation of spurious associations. They report improved model performance in quantitative experiments and user evaluations.
Classification models learn to generalize the associations between data samples and their target classes. However, researchers have increasingly observed that machine learning practice easily leads to systematic errors in AI applications, a phenomenon referred to as AI blindspots. Such blindspots arise when a model is trained on samples (e.g., for cat/dog classification) in which important patterns (e.g., black cats) are missing, or in which peripheral or undesirable patterns (e.g., dogs on a grass background) mislead the model towards a certain class. Even more sophisticated techniques cannot be guaranteed to capture, reason about, and prevent such spurious associations. In this work, we propose ESCAPE, a visual analytic system that promotes a human-in-the-loop workflow for countering systematic errors. By allowing users to easily inspect spurious associations, the system helps them spontaneously recognize concepts associated with misclassifications and evaluate mitigation strategies that can reduce biased associations. We also propose two statistical approaches: relative concept association, which better quantifies the associations between a concept and instances, and a debiasing method that mitigates spurious associations. We demonstrate the utility of the proposed ESCAPE system and statistical measures through extensive evaluation, including quantitative experiments, usage scenarios, expert interviews, and controlled user experiments.
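As a rough illustration of what it can mean to surface concepts associated with misclassifications, the sketch below computes a simple error-rate ratio per concept: the error rate among instances carrying a concept, divided by the overall error rate. This is not the relative concept association measure proposed in this work, whose definition is not given here. The function name `concept_error_lift`, the record format, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def concept_error_lift(records):
    """Hypothetical helper, not the paper's measure.

    records: iterable of (concepts, is_error) pairs, where `concepts` is a
    set of concept labels attached to one instance and `is_error` says
    whether the model misclassified that instance.

    Returns {concept: error rate with the concept / overall error rate}.
    Values well above 1 flag concepts that co-occur with errors more often
    than average.
    """
    records = list(records)
    if not records:
        return {}
    overall = sum(1 for _, err in records if err) / len(records)
    if overall == 0:
        return {}  # no errors at all, so there is nothing to rank

    # concept -> [number of instances with it, number of those misclassified]
    counts = defaultdict(lambda: [0, 0])
    for concepts, err in records:
        for c in concepts:
            counts[c][0] += 1
            counts[c][1] += int(err)

    return {c: (e / n) / overall for c, (n, e) in counts.items()}


# Toy data (invented): cats on grass and black cats tend to be misclassified.
records = [
    ({"dog", "grass"}, False),
    ({"dog", "grass"}, False),
    ({"cat", "grass"}, True),
    ({"cat", "black"}, True),
    ({"cat", "sofa"}, False),
    ({"dog", "sofa"}, False),
    ({"cat", "grass"}, True),
    ({"cat", "sofa"}, False),
]
lift = concept_error_lift(records)
for concept, value in sorted(lift.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {value:.2f}")
```

A ratio like this only ranks candidate concepts; it does not say whether an association is spurious, which is the judgment the human-in-the-loop workflow is meant to support.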