LapSum -- One Method to Differentiate Them All: Ranking, Sorting and Top-k Selection
LapSum provides a scalable solution for differentiable ordering in machine learning, enabling more efficient training of models that rely on ranking or selection operations.
The paper tackles the problem of making order-type operations like ranking, sorting, and top-k selection differentiable, presenting a method based on the LapSum function that achieves O(n log n) computational complexity. It demonstrates superior performance over state-of-the-art techniques for high-dimensional vectors and large k values.
We present a novel technique for constructing differentiable order-type operations, including soft ranking, soft top-k selection, and soft permutations. Our approach leverages an efficient closed-form formula for the inverse of the function LapSum, defined as the sum of Laplace distributions. This formulation ensures low computational and memory complexity in selecting the highest activations, enabling losses and gradients to be computed in $O(n\log{}n)$ time. Through extensive experiments, we demonstrate that our method outperforms state-of-the-art techniques for high-dimensional vectors and large $k$ values. Furthermore, we provide efficient implementations for both CPU and CUDA environments, underscoring the practicality and scalability of our method for large-scale ranking and differentiable ordering problems.
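To make the idea in the abstract concrete, here is a minimal NumPy sketch of soft top-k selection built on a sum of Laplace CDFs. It is an illustration under stated assumptions, not the authors' implementation: the names `laplace_cdf`, `soft_top_k`, and `alpha`, and the exact parameterization of the threshold `b`, are hypothetical. The sketch also finds `b` by plain bisection instead of the closed-form inverse the paper describes, so it does not reproduce the $O(n\log{}n)$ guarantee and computes only the forward pass, not gradients.

```python
import numpy as np


def laplace_cdf(x):
    """CDF of the standard Laplace distribution (location 0, scale 1)."""
    x = np.asarray(x, dtype=float)
    tail = 0.5 * np.exp(-np.abs(x))  # exponent is non-positive, so no overflow
    return np.where(x <= 0, tail, 1.0 - tail)


def soft_top_k(scores, k, alpha=1.0, tol=1e-10, max_iter=200):
    """Soft membership weights in [0, 1] that sum (approximately) to k.

    Assumed formulation: p_i = F((scores_i - b) / alpha), where F is the
    Laplace CDF and the threshold b is chosen so that sum_i p_i = k.
    Bisection stands in for the paper's closed-form inverse.
    """
    scores = np.asarray(scores, dtype=float)
    if not 0 < k < scores.size:
        raise ValueError("k must satisfy 0 < k < len(scores)")
    if alpha <= 0:
        raise ValueError("alpha must be positive")

    def total(b):
        # Continuous and strictly decreasing in b, from n down to 0.
        return laplace_cdf((scores - b) / alpha).sum()

    # Grow a bracket [lo, hi] with total(lo) >= k >= total(hi).
    lo, hi = scores.min(), scores.max()
    step = alpha
    while total(lo) < k:
        lo -= step
        step *= 2.0
    step = alpha
    while total(hi) > k:
        hi += step
        step *= 2.0

    # Bisect on the threshold until the bracket is narrower than tol.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if total(mid) > k:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    b = 0.5 * (lo + hi)
    return laplace_cdf((scores - b) / alpha)


if __name__ == "__main__":
    weights = soft_top_k([3.0, 1.0, 2.0, 0.5], k=2, alpha=0.1)
    print(weights, weights.sum())
```

In this sketch a smaller `alpha` pushes the weights toward a hard 0/1 top-k indicator, while a larger `alpha` spreads the mass more evenly across entries. Training with such weights would require an automatic differentiation framework or hand-written gradients, which the sketch leaves out.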