Interpretable Image Recognition with Hierarchical Prototypes
This work addresses the need for interpretable AI in vision tasks, particularly for users who require transparent decision-making. Its contribution is incremental, since it builds on existing prototype-based methods.
The paper tackles the problem of interpretable image recognition by introducing a model that uses hierarchically organized prototypes to classify objects at every level of a predefined taxonomy, enabling distinct explanations for predictions at each level and interpretable classification of unseen classes. The model performs approximately as well as a black-box counterpart on tasks involving familiar and unseen classes from a subset of ImageNet.
Vision models are interpretable when they classify objects on the basis of features that a person can directly understand. Recently, methods relying on visual feature prototypes have been developed for this purpose. However, in contrast to how humans categorize objects, these approaches have not yet made use of any taxonomical organization of class labels. With such approaches, for instance, we may see why a chimpanzee is classified as a chimpanzee, but not why it was considered to be a primate or even an animal. In this work we introduce a model that uses hierarchically organized prototypes to classify objects at every level in a predefined taxonomy. Hence, we may find distinct explanations for the prediction an image receives at each level of the taxonomy. The hierarchical prototypes enable the model to perform another important task: interpretably classifying images from previously unseen classes at the level of the taxonomy to which they correctly relate, e.g., classifying a handgun as a weapon when the only weapons in the training data are rifles. With a subset of ImageNet, we test our model against its counterpart black-box model on two tasks: 1) classification of data from familiar classes, and 2) classification of data from previously unseen classes at the appropriate level in the taxonomy. We find that our model performs approximately as well as its counterpart black-box model while allowing for each classification to be interpreted.
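The abstract does not describe the architecture, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy taxonomy, hand-picked 2-D feature vectors standing in for learned prototypes, negative squared Euclidean distance as the similarity measure, and a fixed similarity threshold for deciding when to stop descending. All names (`TAXONOMY`, `PROTOTYPES`, `classify`, `min_similarity`) are hypothetical.

```python
import numpy as np

# Hypothetical toy taxonomy: each internal node maps to its children.
TAXONOMY = {
    "object": ["animal", "weapon"],
    "animal": ["chimpanzee", "dog"],
    "weapon": ["rifle"],
}

# Hypothetical prototypes: rows are 2-D feature vectors chosen by hand.
# A real model would learn these from image features.
PROTOTYPES = {
    "animal": np.array([[0.0, 1.0]]),
    "weapon": np.array([[1.0, 0.0]]),
    "chimpanzee": np.array([[-0.5, 1.5]]),
    "dog": np.array([[0.5, 1.5]]),
    "rifle": np.array([[1.5, -0.5]]),
}


def closest_prototype(x, protos):
    """Return (similarity, index) of the prototype nearest to x.

    Similarity is the negative squared Euclidean distance (an assumption).
    """
    d2 = np.sum((protos - x) ** 2, axis=1)
    idx = int(np.argmin(d2))
    return -float(d2[idx]), idx


def classify(x, root="object", min_similarity=-1.0):
    """Descend the taxonomy, recording the label and matching prototype
    index chosen at each level. Stop early when no child is similar
    enough, so an unseen class is labeled only as far as it fits."""
    path = []
    node = root
    while node in TAXONOMY:
        matches = {c: closest_prototype(x, PROTOTYPES[c]) for c in TAXONOMY[node]}
        best = max(matches, key=lambda c: matches[c][0])
        score, proto_idx = matches[best]
        if score < min_similarity:
            break  # nothing at this level resembles x closely enough
        path.append((best, proto_idx))
        node = best
    return path


familiar = classify(np.array([-0.5, 1.5]))  # features close to the chimpanzee prototype
unseen = classify(np.array([1.0, 0.6]))     # weapon-like, but far from the rifle prototype
print([label for label, _ in familiar])
print([label for label, _ in unseen])
```

In this sketch, each entry in the returned path pairs a label with the index of the prototype that produced it, which is what would serve as the per-level explanation. The second input stops at `weapon` because its similarity to the only rifle prototype falls below the assumed threshold, mirroring the handgun example in the abstract.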