Super-sparse Learning in Similarity Spaces
This work addresses the problem of high computational demands in similarity-based applications, offering a more efficient and interpretable solution, though the contribution is incremental, as it builds on existing prototype reduction methods.
The paper tackles the computational inefficiency of machine learning in similarity spaces by jointly learning a classification function and an optimal set of virtual prototypes, reducing test-time complexity by up to ten times for methods like SVMs, LASSO, and ridge regression with minimal accuracy loss.
In several applications, input samples are more naturally represented in terms of similarities between each other, rather than in terms of feature vectors. In these settings, machine-learning algorithms can become very computationally demanding, as they may require matching the test samples against a very large set of reference prototypes. To mitigate this issue, different approaches have been developed to reduce the number of required reference prototypes. Current reduction approaches select a small subset of representative prototypes in the space induced by the similarity measure, and then separately train the classification function on the reduced subset. However, decoupling these two steps may prevent the number of prototypes from being reduced effectively without compromising accuracy. We overcome this limitation by jointly learning the classification function along with an optimal set of virtual prototypes, whose number can be either fixed a priori or optimized according to application-specific criteria. Learning a super-sparse set of virtual prototypes yields much sparser solutions, drastically reducing complexity at test time, at the expense of a slightly increased complexity during training. A much smaller set of prototypes also results in easier-to-interpret decisions. We empirically show that our approach can reduce the test-time complexity of Support Vector Machines, LASSO and ridge regression by up to ten times, with almost no loss in classification accuracy.
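The abstract does not spell out the optimization procedure, so the following is only a minimal sketch of what jointly learning a classifier and a small set of virtual prototypes could look like. It assumes an RBF similarity, a ridge-regression loss with an intercept, and a simple alternating scheme (closed-form coefficients, then one gradient step on the prototypes). The function names `rbf` and `fit_virtual_prototypes`, the hyperparameters, and the toy data are all hypothetical choices made for illustration, not the authors' algorithm.

```python
import numpy as np

def rbf(X, Z, gamma):
    # Pairwise RBF similarities k(x, z) = exp(-gamma * ||x - z||^2).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def solve_ridge(K, y, lam):
    # Ridge regression on similarity features plus a constant intercept column.
    Phi = np.hstack([K, np.ones((K.shape[0], 1))])
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
    return w[:-1], w[-1]

def fit_virtual_prototypes(X, y, m=4, gamma=0.1, lam=1.0, lr=0.05,
                           iters=100, seed=0):
    # Assumed alternating scheme: refit coefficients, then move prototypes.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialize the m virtual prototypes at randomly chosen training samples.
    Z = X[rng.choice(n, size=m, replace=False)].copy()
    for _ in range(iters):
        K = rbf(X, Z, gamma)                    # n x m similarity matrix
        beta, b = solve_ridge(K, y, lam)
        r = K @ beta + b - y                    # residuals at fixed (beta, b)
        # Gradient of (1 / 2n) * ||r||^2 with respect to each prototype z_j,
        # using dk(x_i, z_j)/dz_j = 2 * gamma * (x_i - z_j) * k(x_i, z_j).
        W = r[:, None] * K * beta[None, :]      # n x m weights
        grad = (2.0 * gamma / n) * (W.T @ X - W.sum(axis=0)[:, None] * Z)
        Z -= lr * grad                          # prototypes need not be samples
    beta, b = solve_ridge(rbf(X, Z, gamma), y, lam)
    return Z, beta, b

# Toy data: two well-separated Gaussian blobs with labels -1 and +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(100, 2)),
               rng.normal(2.0, 0.5, size=(100, 2))])
y = np.concatenate([-np.ones(100), np.ones(100)])

Z, beta, b = fit_virtual_prototypes(X, y, m=4, gamma=0.1)

# Classifying a sample needs only m similarity evaluations, not one per
# training sample.
pred = np.sign(rbf(X, Z, 0.1) @ beta + b)
acc = float((pred == y).mean())
```

In this sketch, the test-time cost is governed by the number of prototypes `m` (here 4) rather than by the number of training samples (here 200), which is the kind of saving the abstract describes. Because the prototypes are updated by gradient steps, they are free to leave the training set, which is what makes them "virtual".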