Match Them Up: Visually Explainable Few-shot Image Classification
This paper tackles the lack of explainability in few-shot image classification, a barrier to applying few-shot learning (FSL) in risk-sensitive domains, by providing visual explanations of the inference process.
This paper addresses the lack of explainability in few-shot learning (FSL) by combining visual representations from a backbone model with weights from an explainable classifier, producing weighted representations that keep only a minimal set of distinguishable features. A discriminator then compares these representations for each pair of support and query images, and the highest-scoring pairs determine the classification results. The method achieves good accuracy and explainability on three mainstream datasets.
Few-shot learning (FSL) approaches usually assume that knowledge pre-trained on base (seen) categories transfers well to novel (unseen) categories. However, neither part of this assumption is guaranteed, the transfer to novel categories least of all. As a result, the inference process of most FSL methods is opaque, which hampers their application in risk-sensitive areas. In this paper, we propose a new way to perform FSL for image classification, using visual representations from the backbone model and weights generated by a recently proposed explainable classifier. The weighted representations include only a minimal number of distinguishable features, and the visualized weights serve as an informative hint about the FSL process. Finally, a discriminator compares the representations of each pair of images from the support set and the query set, and the pairs with the highest scores decide the classification results. Experimental results show that the proposed method achieves both good accuracy and satisfactory explainability on three mainstream datasets.
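The pairwise matching step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, the per-location features and weights are toy values standing in for backbone outputs and explainable-classifier weights, and cosine similarity is swapped in as an assumed stand-in for the learned discriminator.

```python
import math


def weighted_representation(features, weights):
    """Pool per-location feature vectors using one weight per location.

    `features` is a list of equal-length vectors (one per spatial location);
    `weights` holds one scalar per location. Both are assumed inputs here.
    """
    dim = len(features[0])
    pooled = [0.0] * dim
    for vec, w in zip(features, weights):
        for i in range(dim):
            pooled[i] += w * vec[i]
    return pooled


def cosine_score(a, b):
    """Assumed stand-in for the discriminator: cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(query_rep, support):
    """Return the label and score of the best-matching support image.

    `support` is a list of (label, representation) pairs.
    """
    best_label, best_score = None, float("-inf")
    for label, rep in support:
        score = cosine_score(query_rep, rep)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Toy 2-way 1-shot episode with two spatial locations per image.
support = [
    ("cat", weighted_representation([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])),
    ("dog", weighted_representation([[1.0, 0.0], [0.0, 1.0]], [0.0, 1.0])),
]
query = weighted_representation([[0.9, 0.1], [0.0, 1.0]], [1.0, 0.0])
label, score = classify(query, support)
```

In this toy episode the weights suppress the second location of the query, so its pooled representation lies closest to the "cat" support image. The same weights, visualized over the image, are what the abstract describes as a hint about which regions drove the match.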