LG MLFeb 16, 2018

Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis

Pengtao Xie, Wei Wu, Yichen Zhu, Eric P. Xing

arXiv:1802.06014v111.131 citations

Originality Incremental advance

AI Analysis

This work addresses theoretical and computational issues in distance metric learning for machine learning applications, though it is incremental as it builds on existing orthogonality-promoting methods.

The paper tackles the non-convex optimization and lack of theoretical understanding in orthogonality-promoting regularization for distance metric learning by proposing convex relaxations and formal analyses, achieving improved balancedness, compactness, and generalization with computational efficiency in experiments.

Distance metric learning (DML), which learns a distance metric from labeled "similar" and "dissimilar" data pairs, is widely utilized. Recently, several works investigate orthogonality-promoting regularization (OPR), which encourages the projection vectors in DML to be close to being orthogonal, to achieve three effects: (1) high balancedness -- achieving comparable performance on both frequent and infrequent classes; (2) high compactness -- using a small number of projection vectors to achieve a "good" metric; (3) good generalizability -- alleviating overfitting to training data. While showing promising results, these approaches suffer three problems. First, they involve solving non-convex optimization problems where achieving the global optimal is NP-hard. Second, it lacks a theoretical understanding why OPR can lead to balancedness. Third, the current generalization error analysis of OPR is not directly on the regularizer. In this paper, we address these three issues by (1) seeking convex relaxations of the original nonconvex problems so that the global optimal is guaranteed to be achievable; (2) providing a formal analysis on OPR's capability of promoting balancedness; (3) providing a theoretical analysis that directly reveals the relationship between OPR and generalization performance. Experiments on various datasets demonstrate that our convex methods are more effective in promoting balancedness, compactness, and generalization, and are computationally more efficient, compared with the nonconvex methods.

View on arXiv PDF

Similar