Relational Local Explanations
This work addresses the need for more insightful explanations of machine learning models applied to structured data. It appears incremental, however, because it builds on existing attribution methods by adding relational analysis.
The paper tackles the problem that existing post-hoc explanation methods produce independent feature attributions and therefore ignore inter-variable relationships in structured data such as images and text. It develops a novel model-agnostic, permutation-based relational attribution approach, which experimental evaluations show to be effective and valid compared with state-of-the-art techniques.
The majority of existing post-hoc explanation approaches for machine learning models produce independent, per-variable feature attribution scores, ignoring a critical inherent characteristic of homogeneously structured data, such as visual or text data: there exist latent inter-variable relationships between features. In response, we develop a novel model-agnostic and permutation-based feature attribution approach based on the relational analysis between input variables. As a result, we are able to gain broader insight into the predictions and decisions of machine learning models. Experimental evaluations of our framework against state-of-the-art attribution techniques, on various setups involving both image and text data modalities, demonstrate the effectiveness and validity of our method.
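The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of permutation-based attribution extended with a pairwise (relational) term. It is not the authors' method. The function names `permute_columns` and `relational_permutation_scores`, the toy model, and the choice of scoring (mean absolute change in the model output, with a second-order residual for pairs) are all assumptions made for illustration.

```python
import numpy as np


def permute_columns(X, cols, perms):
    """Return a copy of X where each column in `cols` is shuffled by its fixed permutation."""
    Xp = X.copy()
    for c in cols:
        Xp[:, c] = X[perms[c], c]
    return Xp


def relational_permutation_scores(predict, X, seed=0):
    """Hypothetical helper: per-feature and pairwise permutation scores.

    predict: callable mapping an (n, d) array to an (n,) array of outputs.
    Returns (single, pair): a (d,) vector and a symmetric (d, d) matrix.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # One fixed permutation per feature, reused for single and joint shuffles.
    perms = [rng.permutation(n) for _ in range(d)]
    base = predict(X)

    # Independent, per-variable scores: output change when one column is shuffled.
    single_out = [predict(permute_columns(X, [i], perms)) for i in range(d)]
    single = np.array([np.mean(np.abs(base - out)) for out in single_out])

    # Relational scores: the part of the joint effect that the two
    # single-feature effects do not explain (zero for an additive model).
    pair = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            joint = predict(permute_columns(X, [i, j], perms))
            residual = base - single_out[i] - single_out[j] + joint
            pair[i, j] = pair[j, i] = np.mean(np.abs(residual))
    return single, pair


def toy_model(X):
    """Assumed toy model: features 0 and 1 interact, feature 2 acts alone."""
    return X[:, 0] * X[:, 1] + X[:, 2]


data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
single, pair = relational_permutation_scores(toy_model, X)
print("per-feature scores:", np.round(single, 3))
print("pairwise scores:\n", np.round(pair, 3))
```

On this toy model the pairwise score for features 0 and 1 is clearly positive, while the pairs involving feature 2 score zero up to floating-point error. A purely per-variable attribution would not expose that distinction, which is the gap the abstract describes.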