OCLGMLJan 20, 2024

Accelerating Sinkhorn Algorithm with Sparse Newton Iterations

arXiv:2401.12253v114 citationsICLR
Originality Incremental advance
AI Analysis

This incremental improvement accelerates optimal transport computations, benefiting machine learning applications that rely on these distances.

The paper tackles the slow convergence of the Sinkhorn algorithm for optimal transport by introducing Sinkhorn-Newton-Sparse (SNS), which combines early stopping with a Newton-type subroutine using a sparse Hessian approximation, achieving orders of magnitude faster convergence in practical cases like Wasserstein distance calculations.

Computing the optimal transport distance between statistical distributions is a fundamental task in machine learning. One remarkable recent advancement is entropic regularization and the Sinkhorn algorithm, which utilizes only matrix scaling and guarantees an approximated solution with near-linear runtime. Despite the success of the Sinkhorn algorithm, its runtime may still be slow due to the potentially large number of iterations needed for convergence. To achieve possibly super-exponential convergence, we present Sinkhorn-Newton-Sparse (SNS), an extension to the Sinkhorn algorithm, by introducing early stopping for the matrix scaling steps and a second stage featuring a Newton-type subroutine. Adopting the variational viewpoint that the Sinkhorn algorithm maximizes a concave Lyapunov potential, we offer the insight that the Hessian matrix of the potential function is approximately sparse. Sparsification of the Hessian results in a fast $O(n^2)$ per-iteration complexity, the same as the Sinkhorn algorithm. In terms of total iteration count, we observe that the SNS algorithm converges orders of magnitude faster across a wide range of practical cases, including optimal transportation between empirical distributions and calculating the Wasserstein $W_1, W_2$ distance of discretized densities. The empirical performance is corroborated by a rigorous bound on the approximate sparsity of the Hessian matrix.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes