Interpretable to Whom? A Role-based Model for Analyzing Interpretable Machine Learning Systems
This work addresses the need for clearer interpretability standards for researchers, developers, and regulators. It is incremental, however: it builds on existing arguments and introduces no new empirical methods or data.
The paper tackles the problem of defining interpretability in machine learning. It proposes a role-based model that identifies the different roles agents can fulfill in relation to a system, so that one can ask to whom the system is interpretable. It then applies the model to a variety of scenarios to show how an agent's role shapes its goals and, in turn, what interpretability should mean for that agent.
Several researchers have argued that a machine learning system's interpretability should be defined in relation to a specific agent or task: we should not ask if the system is interpretable, but to whom is it interpretable. We describe a model intended to help answer this question, by identifying different roles that agents can fulfill in relation to the machine learning system. We illustrate the use of our model in a variety of scenarios, exploring how an agent's role influences its goals, and the implications for defining interpretability. Finally, we make suggestions for how our model could be useful to interpretability researchers, system developers, and regulatory bodies auditing machine learning systems.