LG · May 27

When Interpretability Is Unequally Distributed: Fairness in Hybrid Interpretable Models

Ziba Jabbar Zare, Ulrich Aïvodji, Julien Ferry, Thibaut Vidal

arXiv:2605.28626

AI Analysis

For practitioners deploying hybrid interpretable models, the paper highlights a previously overlooked procedural fairness concern and provides a method to audit and mitigate it.

The paper identifies and formalizes Interpretability Coverage Disparity (ICD), a fairness issue in hybrid interpretable models where demographic groups may be unequally routed to interpretable versus black-box components. Experiments across four methods and three datasets show substantial ICD in intermediate transparency regimes, and simple constraints can reduce ICD with minimal accuracy loss, sometimes also improving predictive fairness.

Hybrid interpretable models combine a transparent component with a black-box model by assigning some examples to the former and deferring the rest to the latter. While this design enables flexible tradeoffs between accuracy and interpretability, it also raises a distinct procedural fairness concern: some demographic groups may systematically receive interpretable decisions, while others are disproportionately routed to a black box. We formalize this issue as Interpretability Coverage Disparity (ICD), a demographic-parity-style measure applied to the routing decision of hybrid interpretable models. Using tools from predictive multiplicity, we study ICD across four hybrid interpretable learning methods, three standard fairness benchmark datasets, and multiple sensitive attributes. Our experiments reveal substantial ICD in intermediate transparency regimes, where both the interpretable and black-box components are actively used. We further show that simple coverage-disparity constraints can significantly reduce ICD in exact hybrid learning methods, with marginal impact on accuracy and sparsity. In several settings, ICD mitigation also improves standard algorithmic fairness metrics. These results show that hybrid interpretable models should be audited not only for predictive fairness, but also for how they allocate interpretability across individuals and groups.
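To make the idea of a demographic-parity-style measure on the routing decision concrete, here is a minimal sketch in Python. The abstract does not give the exact formula, so this assumes ICD is the largest gap between any two groups in the fraction of examples routed to the interpretable component. The function names, the `epsilon` tolerance, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def coverage_by_group(routed_to_interpretable, groups):
    """Fraction of each group's examples handled by the interpretable component.

    routed_to_interpretable: iterable of 0/1 (or bool) routing decisions,
        where 1 means the transparent component made the prediction.
    groups: iterable of sensitive-attribute values, aligned with the routing.
    """
    routed = list(routed_to_interpretable)
    groups = list(groups)
    if len(routed) != len(groups):
        raise ValueError("routing decisions and groups must have equal length")
    if not routed:
        raise ValueError("at least one example is required")

    totals = defaultdict(int)
    covered = defaultdict(int)
    for r, g in zip(routed, groups):
        totals[g] += 1
        covered[g] += int(bool(r))
    return {g: covered[g] / totals[g] for g in totals}


def interpretability_coverage_disparity(routed_to_interpretable, groups):
    """Assumed ICD: max minus min interpretable-coverage rate across groups."""
    rates = coverage_by_group(routed_to_interpretable, groups)
    return max(rates.values()) - min(rates.values())


def satisfies_coverage_constraint(routed_to_interpretable, groups, epsilon):
    """Hypothetical audit check: is the assumed ICD within tolerance epsilon?"""
    return interpretability_coverage_disparity(routed_to_interpretable, groups) <= epsilon


# Toy example: group "a" gets an interpretable decision 3 times out of 4,
# group "b" only 1 time out of 4.
routed = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = coverage_by_group(routed, groups)
icd = interpretability_coverage_disparity(routed, groups)
print(rates)
print(icd)
```

In this toy data the overall transparency is 50%, an intermediate regime in which both components are used, yet coverage differs sharply between the two groups. A check like `satisfies_coverage_constraint` only audits a fixed routing; the paper's mitigation instead imposes coverage-disparity constraints during learning, which this sketch does not attempt to reproduce.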
