Learning to Approximate a Bregman Divergence
This work addresses the need for flexible divergence approximations in machine learning, though it is incremental as it builds on existing metric learning frameworks.
The paper tackles the problem of approximating arbitrary Bregman divergences from supervision by developing a method based on piecewise linear approximations of convex generating functions, achieving a generalization error of O_p(m^{-1/2}) that matches known bounds in less general settings and showing competitive empirical performance in clustering and ranking tasks.
Bregman divergences generalize measures such as the squared Euclidean distance and the KL divergence, and arise throughout many areas of machine learning. In this paper, we focus on the problem of approximating an arbitrary Bregman divergence from supervision, and we provide a well-principled approach to analyzing such approximations. We develop a formulation and algorithm for learning arbitrary Bregman divergences based on approximating their underlying convex generating function via a piecewise linear function. We provide theoretical approximation bounds using our parameterization and show that the generalization error $O_p(m^{-1/2})$ for metric learning using our framework matches the known generalization error in the strictly less general Mahalanobis metric learning setting. We further demonstrate empirically that our method performs well in comparison to existing metric learning methods, particularly for clustering and ranking problems.