Project and Forget: Solving Large-Scale Metric Constrained Problems
This work addresses a bottleneck in machine learning, the handling of large-scale metric constraints, by offering a scalable solution. The contribution is incremental in that it builds on Bregman projections.
The paper tackles metric constrained problems with many inequality constraints. Such problems are central to machine learning, but existing methods are limited to specific metrics or small problem sizes. The authors demonstrate that their Project and Forget algorithm outperforms state-of-the-art methods in CPU time and problem size on tasks such as correlation clustering, metric nearness, and metric learning.
Given a set of dissimilarity measurements amongst data points, determining which metric representation is most "consistent" with the input measurements, or which metric best captures the relevant geometric features of the data, is a key step in many machine learning algorithms. Existing methods are restricted to specific kinds of metrics or to small problem sizes because of the large number of metric constraints in such problems. In this paper, we provide an active set algorithm, Project and Forget, that uses Bregman projections to solve metric constrained problems with many (possibly exponentially many) inequality constraints. We provide a theoretical analysis of \textsc{Project and Forget} and prove that our algorithm converges to the global optimal solution and that the $L_2$ distance of the current iterate to the optimal solution decays asymptotically at an exponential rate. We demonstrate that our method can solve large instances of three types of metric constrained problems: general weight correlation clustering, metric nearness, and metric learning. In each case, it outperforms state-of-the-art methods with respect to CPU time and problem size.
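To make the active set idea concrete, the following is a minimal sketch of a project-and-forget style loop for a small metric nearness instance. It is an illustration under stated assumptions, not the authors' implementation. The function name `project_and_forget_metric_nearness` is hypothetical. The objective is assumed to be squared $L_2$ distance to the input, for which a Bregman projection onto a halfspace reduces to an ordinary Euclidean projection. A brute-force scan over triples stands in for the separation oracle. The dual-corrected update is a Hildreth/Dykstra-style step chosen for this sketch.

```python
import itertools


def project_and_forget_metric_nearness(D, max_iters=1000, tol=1e-9):
    """Sketch: L2-nearest symmetric matrix to D obeying every triangle inequality.

    D is a symmetric list of lists of dissimilarities with a zero diagonal.
    All names and defaults here are illustrative assumptions.
    """
    n = len(D)
    X = [[float(v) for v in row] for row in D]  # current iterate, kept symmetric
    duals = {}  # remembered constraints: (i, j, k) -> dual z for x_ij <= x_ik + x_kj

    def project(i, j, k):
        # Dual-corrected projection onto the halfspace x_ij - x_ik - x_kj <= 0.
        z = duals.get((i, j, k), 0.0)
        theta = (X[i][j] - X[i][k] - X[k][j]) / 3.0  # violation / ||a||^2, ||a||^2 = 3
        c = max(theta, -z)  # never let the dual variable drop below zero
        for p, q, sign in ((i, j, -1.0), (i, k, 1.0), (k, j, 1.0)):
            X[p][q] += sign * c
            X[q][p] = X[p][q]
        duals[(i, j, k)] = z + c
        return abs(c)

    for _ in range(max_iters):
        largest_step = 0.0
        # Oracle (assumed brute force): find violated triangles and project onto them.
        for i, j in itertools.combinations(range(n), 2):
            for k in range(n):
                if k != i and k != j and X[i][j] - X[i][k] - X[k][j] > tol:
                    largest_step = max(largest_step, project(i, j, k))
        # Project: revisit every constraint currently remembered.
        for key in list(duals):
            largest_step = max(largest_step, project(*key))
        # Forget: drop constraints whose dual variable has returned to zero.
        for key in [key for key, z in duals.items() if z <= 0.0]:
            del duals[key]
        if largest_step <= tol:
            break
    return X


# One violated triangle: d(0,1) = 10 exceeds d(0,2) + d(2,1) = 2.
example = project_and_forget_metric_nearness([[0, 10, 1], [10, 0, 1], [1, 1, 0]])
print(example)
```

In the three-point example, the excess of 8 on the violated triangle is spread equally over its three edges, so the long edge shrinks to about 7.33 and the two short edges grow to about 3.67. Only constraints that were ever violated and still carry a nonzero dual variable stay in `duals`, which is what keeps the working set small relative to the full list of triangle inequalities.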