LG AIApr 15, 2021

Sparse online relative similarity learning

Dezhong Yao, Peilin Zhao, Chen Yu, Hai Jin, Bin Li

arXiv:2104.07501v15.56 citationsh-index: 79

Originality Incremental advance

AI Analysis

This work addresses scalability issues in similarity learning for high-dimensional data mining and machine learning tasks, though it appears incremental as it builds on existing online methods.

The authors tackled the inefficiency of online similarity learning in high-dimensional tasks by introducing Sparse Online Relative Similarity (SORS) algorithms, which learn a sparse model to reduce memory and computational costs, achieving encouraging empirical results on real-world datasets.

For many data mining and machine learning tasks, the quality of a similarity measure is the key for their performance. To automatically find a good similarity measure from datasets, metric learning and similarity learning are proposed and studied extensively. Metric learning will learn a Mahalanobis distance based on positive semi-definite (PSD) matrix, to measure the distances between objectives, while similarity learning aims to directly learn a similarity function without PSD constraint so that it is more attractive. Most of the existing similarity learning algorithms are online similarity learning method, since online learning is more scalable than offline learning. However, most existing online similarity learning algorithms learn a full matrix with d 2 parameters, where d is the dimension of the instances. This is clearly inefficient for high dimensional tasks due to its high memory and computational complexity. To solve this issue, we introduce several Sparse Online Relative Similarity (SORS) learning algorithms, which learn a sparse model during the learning process, so that the memory and computational cost can be significantly reduced. We theoretically analyze the proposed algorithms, and evaluate them on some real-world high dimensional datasets. Encouraging empirical results demonstrate the advantages of our approach in terms of efficiency and efficacy.

View on arXiv PDF

Similar