A prototype-based model for set classification
This work addresses set classification problems in computer vision and natural language processing, offering an incremental improvement with enhanced explainability and resource efficiency.
The paper tackles set classification with a prototype-based model on the Grassmann manifold. The model learns subspace prototypes together with relevance factors that select the subspace dimensionality, yielding a transparent classifier that compares favorably with transformer-based models in performance, explainability, and computational requirements.
Classification of sets of inputs (e.g., images and texts) is an active area of research within both computer vision (CV) and natural language processing (NLP). A common way to represent a set of vectors is to model it as a linear subspace. In this contribution, we present a prototype-based approach for learning on the manifold formed by such linear subspaces, the Grassmann manifold. Our proposed method learns a set of subspace prototypes that capture the representative characteristics of the classes, and a set of relevance factors that automate the selection of the subspace dimensionality. This leads to a transparent classifier that reports the computed impact of each input vector on its decision. Through experiments on benchmark image and text datasets, we demonstrate that the proposed classifier compares favorably with transformer-based models in terms of not only performance and explainability but also computational resource requirements.
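To make the idea of classifying a set through its subspace concrete, the following is a minimal NumPy sketch rather than the implementation used in this work. It assumes that a set is stored as the columns of a matrix, that its subspace is spanned by the leading left singular vectors, and that the dissimilarity to a prototype is one minus a relevance-weighted sum of the cosines of the principal angles. The function names, the exact form of the dissimilarity, and the fixed (not learned) prototypes and relevance factors are all illustrative assumptions.

```python
import numpy as np

def subspace_basis(X, d):
    """Orthonormal basis (D x d) for the d-dimensional subspace that best
    fits the set of vectors stored as the columns of X (D x n)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :d]

def principal_cosines(P, W):
    """Cosines of the principal angles between the subspaces spanned by
    the orthonormal bases P and W, in descending order."""
    s = np.linalg.svd(P.T @ W, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against round-off above 1

def relevance_dissimilarity(P, W, relevances):
    """Assumed dissimilarity: 1 - sum_k relevances[k] * cos(theta_k).
    The relevances are taken to be non-negative and to sum to one."""
    return 1.0 - float(relevances @ principal_cosines(P, W))

def classify(P, prototypes, labels, relevances):
    """Nearest-prototype decision: return the label of the closest prototype."""
    dists = [relevance_dissimilarity(P, W, relevances) for W in prototypes]
    return labels[int(np.argmin(dists))]

# Toy usage with random data (hypothetical prototypes, not learned ones).
rng = np.random.default_rng(0)
D, d = 10, 3
W0 = subspace_basis(rng.normal(size=(D, 5)), d)   # prototype of class "a"
W1 = subspace_basis(rng.normal(size=(D, 5)), d)   # prototype of class "b"
relevances = np.full(d, 1.0 / d)                  # uniform relevance factors

X = W0 @ rng.normal(size=(d, 8))                  # a set of 8 vectors lying in span(W0)
P = subspace_basis(X, d)
predicted = classify(P, [W0, W1], ["a", "b"], relevances)
```

In this sketch, a relevance factor close to zero effectively removes the corresponding principal angle from the comparison, which is one way relevance weights could act as a soft choice of subspace dimensionality.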