AI, LG, ML | Sep 18, 2017

Human Understandable Explanation Extraction for Black-box Classification Models Based on Matrix Factorization

arXiv:1709.06201v1 | 3 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for interpretability in AI systems such as defect detection or diagnosis services, where understanding the decision logic is crucial before deployment. It is incremental, however, in that it applies existing techniques to explanation extraction.

The paper tackles the problem of explaining black-box classification models by proposing a method based on matrix factorization to extract human-understandable, rule-like explanations, and validates it on open and industry datasets, showing reasonable results.

In recent years, a number of artificial intelligence services have been developed, such as defect detection systems and diagnosis systems for customer services. Unfortunately, the core of these services is a black box whose underlying decision-making logic humans cannot understand, even though inspecting that logic is crucial before launching a commercial service. Our goal in this paper is to propose an analytic method of model explanation that is applicable to general classification models. To this end, we introduce the concept of a contribution matrix and an explanation embedding in a constraint space by using matrix factorization. We extract a rule-like model explanation from the contribution matrix with the help of nonnegative matrix factorization. To validate our method, we provide experimental results on open datasets as well as an industry dataset for LTE network diagnosis, and the results show that our method extracts reasonable explanations.
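
The abstract does not say how the contribution matrix is built or how rules are read off the factors, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a nonnegative contribution matrix `C` with one row per sample and one column per feature, filled here with synthetic data. It factorizes `C` with standard multiplicative-update NMF, which is not necessarily the paper's variant. The names `nmf`, `top_features`, and the feature labels are hypothetical.

```python
import numpy as np

def nmf(C, rank, n_iter=1000, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix C (n x d) as W @ H with W, H >= 0,
    using Lee-Seung multiplicative updates for the Frobenius loss."""
    rng = np.random.default_rng(seed)
    n, d = C.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, d)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ C) / (W.T @ W @ H + eps)
        W *= (C @ H.T) / (W @ H @ H.T + eps)
    return W, H

def top_features(H, names, k=2):
    """For each factor (row of H), list the k features with the largest weight."""
    return [[names[j] for j in np.argsort(row)[::-1][:k]] for row in H]

# Synthetic stand-in for a contribution matrix (assumption, not the paper's data):
# 30 samples whose feature contributions mix two hidden patterns.
names = ["f0", "f1", "f2", "f3"]
rng = np.random.default_rng(1)
patterns = np.array([[1.0, 0.8, 0.0, 0.0],
                     [0.0, 0.0, 0.9, 1.0]])
weights = rng.random((30, 2))
C = weights @ patterns

W, H = nmf(C, rank=2)
groups = top_features(H, names)
rel_err = np.linalg.norm(C - W @ H) / np.linalg.norm(C)
print("feature groups per factor:", groups)
print("relative reconstruction error:", rel_err)
```

In this sketch each row of `H` groups features that tend to contribute together, and each row of `W` says how strongly a sample uses each group. Turning such groups into rule-like statements would need thresholds or conditions that the abstract does not specify.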

