Performance is not enough: the story told by a Rashomon quartet
This is an incremental contribution highlighting the need for visualization in model interpretation for researchers and practitioners in machine learning.
The paper addresses the problem that models with similar predictive performance can give different explanations of the relationships in the data. It introduces a synthetic 'Rashomon Quartet': four models with practically identical performance but distinct visual explanations, intended to encourage the use of visualization methods for model comparison.
The usual goal of supervised learning is to find the best model, the one that optimizes a particular performance measure. However, what if the explanation provided by this model is completely different from that of a second model, and different again from that of a third, even though all of them have similarly good fit statistics? Is it possible that equally effective models put the spotlight on different relationships in the data? Inspired by Anscombe's quartet, this paper introduces a Rashomon Quartet, i.e., a set of four models built on a synthetic dataset that have practically identical predictive performance. However, visual exploration reveals distinct explanations of the relations in the data. This illustrative example aims to encourage the use of methods for model visualization to compare predictive models beyond their performance.
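The core idea can be illustrated with a minimal sketch. This is not the paper's quartet or its dataset: it is a hypothetical toy setup with two strongly correlated features and two single-feature linear models, and all names (`make_data`, `fit_single_feature`, `r2_score`) are assumptions made for illustration. It shows how two models can reach nearly the same held-out R² while attributing the effect to different variables.

```python
import numpy as np


def make_data(n, rng):
    """Toy data (assumed for illustration): x2 is a noisy copy of x1."""
    x1 = rng.normal(size=n)
    x2 = x1 + 0.1 * rng.normal(size=n)
    y = x1 + x2 + 0.5 * rng.normal(size=n)
    return np.column_stack([x1, x2]), y


def fit_single_feature(X, y, j):
    """Least-squares fit of y on feature j only; returns [intercept, slope]."""
    A = np.column_stack([np.ones(len(y)), X[:, j]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def r2_score(X, y, j, coef):
    """Coefficient of determination for a single-feature linear model."""
    pred = coef[0] + coef[1] * X[:, j]
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
X_train, y_train = make_data(1000, rng)
X_test, y_test = make_data(1000, rng)

# Model A "explains" y through x1 only; model B through x2 only.
coef_a = fit_single_feature(X_train, y_train, 0)
coef_b = fit_single_feature(X_train, y_train, 1)

r2_a = r2_score(X_test, y_test, 0, coef_a)
r2_b = r2_score(X_test, y_test, 1, coef_b)

print(f"Model A (x1 only): slope={coef_a[1]:.2f}, test R^2={r2_a:.3f}")
print(f"Model B (x2 only): slope={coef_b[1]:.2f}, test R^2={r2_b:.3f}")
```

Judged by R² alone, the two models are practically interchangeable, yet one assigns the whole effect to `x1` and the other to `x2`. The paper makes the analogous point with four different model classes and visual explanations rather than coefficients.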