Guaranteed Classification via Regularized Similarity Learning
This work addresses a theoretical gap in machine learning by connecting similarity learning to classification guarantees. The contribution is incremental: it builds on and improves existing results.
The paper links similarity metric learning to classification performance by proposing a regularized similarity learning formulation with general matrix norms and establishing generalization bounds for it. It shows that good generalization of the learned similarity function guarantees good classification by the resulting linear classifier, extending and improving prior work to cover sparse norms such as the L1-norm and the mixed (2,1)-norm.
Learning an appropriate (dis)similarity function from the available data is a central problem in machine learning, since the success of many machine learning algorithms critically depends on the choice of a similarity function to compare examples. Although many approaches to similarity metric learning have been proposed, there has been little theoretical study of the links between similarity metric learning and the classification performance of the resulting classifier. In this paper, we propose a regularized similarity learning formulation associated with general matrix norms, and establish generalization bounds for it. We show that the generalization error of the resulting linear separator can be bounded by the derived generalization bound of similarity learning. This shows that good generalization of the learnt similarity function guarantees good classification by the resulting linear classifier. Our results extend and improve those obtained by Bellet et al. [3]. Because the techniques used there depend on the notion of uniform stability [6], the bound obtained there holds only for Frobenius matrix-norm regularization. Our techniques, which use the Rademacher complexity [5] and its related Khinchin-type inequality, enable us to establish bounds for regularized similarity learning formulations associated with general matrix norms, including the sparse L1-norm and the mixed (2,1)-norm.
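To make the three regularizers named above concrete, the following is a minimal sketch that computes the Frobenius norm, the entrywise L1-norm, and the mixed (2,1)-norm of a matrix and plugs one of them into a regularized objective. Several pieces are assumptions for illustration and are not taken from the abstract: the bilinear similarity x^T A x', the hinge-type loss on the average signed similarity, the row-wise convention for the (2,1)-norm, and all function and parameter names (`regularized_objective`, `lam`, `margin`).

```python
import numpy as np


def frobenius_norm(A):
    # Square root of the sum of squared entries.
    return float(np.sqrt(np.sum(A ** 2)))


def l1_norm(A):
    # Entrywise L1-norm: sum of absolute values of all entries.
    return float(np.sum(np.abs(A)))


def mixed_21_norm(A):
    # Sum of the Euclidean norms of the rows (row-wise convention assumed).
    return float(np.sum(np.sqrt(np.sum(A ** 2, axis=1))))


def regularized_objective(A, X, y, lam, norm_fn, margin=1.0):
    # Assumed bilinear similarity: S[i, j] = x_i^T A x_j.
    S = X @ A @ X.T
    # Signed similarity y_i * y_j * S[i, j], averaged over j for each i.
    avg_signed = ((y[:, None] * y[None, :]) * S).mean(axis=1)
    # Hinge-type empirical loss (an illustrative choice, not the paper's stated loss).
    loss = np.maximum(0.0, 1.0 - avg_signed / margin).mean()
    # Regularization with an arbitrary matrix norm.
    return float(loss + lam * norm_fn(A))


if __name__ == "__main__":
    A = np.array([[3.0, 4.0], [0.0, 0.0]])
    print(frobenius_norm(A), l1_norm(A), mixed_21_norm(A))

    X = np.array([[1.0, 0.0], [-1.0, 0.0]])
    y = np.array([1.0, -1.0])
    print(regularized_objective(np.eye(2), X, y, lam=0.5, norm_fn=l1_norm))
```

Passing the norm as a function argument mirrors the point of the abstract: the objective has the same shape for every matrix norm, and only the regularizer changes.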