Efficient Optimization Methods for Extreme Similarity Learning with Nonlinear Embeddings
This work addresses a computational bottleneck for researchers and practitioners who use nonlinear embeddings in similarity learning, but it is incremental in that it extends prior results from the linear case to the nonlinear case.
The paper tackles the problem of training similarity learning models with nonlinear embeddings from all possible pairs, which is computationally difficult because the number of pairs is extreme. It provides efficient formulations for the building blocks of optimization algorithms and shows that some methods are highly efficient.
We study the problem of learning similarity by using nonlinear embedding models (e.g., neural networks) from all possible pairs. This problem is well known to be difficult to train because of the extreme number of pairs. For the special case of linear embeddings, many studies have addressed the issue of handling all pairs by considering certain loss functions and developing efficient optimization algorithms. This paper aims to extend these results to general nonlinear embeddings. First, we give detailed derivations and provide clean formulations for efficiently calculating some building blocks of optimization algorithms, such as function evaluation, gradient evaluation, and Hessian-vector products. The result enables the use of many optimization methods for extreme similarity learning with nonlinear embeddings. Second, we study some optimization methods in detail. Because nonlinear embeddings are used, we address implementation issues that differ from those in the linear case. Finally, some methods are shown to be highly efficient for extreme similarity learning with nonlinear embeddings.
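To illustrate why function and gradient evaluation over all pairs need not cost time proportional to the number of pairs, the following is a minimal sketch, not the paper's formulation. It assumes a plain squared loss with target zero on every pair, 0.5 * sum_ij (p_i . q_j)^2, where the rows of P and Q are the embedding vectors of the two sides. The abstract does not specify the loss, so this choice and the function names are assumptions made for illustration only. Under that assumption, the sum equals 0.5 * <P^T P, Q^T Q>, which avoids forming the m x n score matrix.

```python
import numpy as np

def all_pairs_loss_naive(P, Q):
    # Hypothetical baseline: materializes the m x n score matrix, O(m * n * k).
    S = P @ Q.T
    return 0.5 * np.sum(S ** 2)

def all_pairs_loss_gram(P, Q):
    # Same value via the identity sum_ij (p_i . q_j)^2 = <P^T P, Q^T Q>.
    # Only two k x k Gram matrices are formed, O((m + n) * k^2).
    return 0.5 * np.sum((P.T @ P) * (Q.T @ Q))

def all_pairs_grad_P_gram(P, Q):
    # Gradient of the assumed loss with respect to the embedding outputs P.
    # Equals (P Q^T) Q, computed as P (Q^T Q) without the m x n matrix.
    return P @ (Q.T @ Q)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.standard_normal((1000, 8))   # m = 1000 left-side embeddings
    Q = rng.standard_normal((2000, 8))   # n = 2000 right-side embeddings
    print(np.isclose(all_pairs_loss_naive(P, Q), all_pairs_loss_gram(P, Q)))
```

With nonlinear embeddings, P and Q would be the outputs of the embedding models, and the gradient with respect to model parameters would follow from the chain rule through those outputs; the sketch stops at the embedding outputs.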