LGOct 8, 2020

Successive Halving Top-k Operator

arXiv:2010.15552v16 citations
Originality Incremental advance
AI Analysis

This addresses a computational bottleneck in machine learning tasks requiring differentiable top-k selection, though it appears incremental as it builds on existing relaxation techniques.

The paper tackled the problem of making the top-k operator differentiable for gradient-based optimization by introducing a successive halving method that avoids computing softmax over the entire score vector, achieving a better approximation with lower computational cost compared to prior methods.

We propose a differentiable successive halving method of relaxing the top-k operator, rendering gradient-based optimization possible. The need to perform softmax iteratively on the entire vector of scores is avoided by using a tournament-style selection. As a result, a much better approximation of top-k with lower computational cost is achieved compared to the previous approach.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes