LGMLOct 11, 2023

Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions

arXiv:2310.07174v25 citationsh-index: 12
Originality Incremental advance
AI Analysis

This work addresses sorting challenges for complex data types in machine learning, representing an incremental improvement over existing neural sorting methods.

The paper tackled the problem of sorting high-dimensional, abstract inputs like images using neural networks by developing an error-free differentiable swap function to ensure differentiability and a permutation-equivariant Transformer to capture dependencies, achieving performance better than or comparable to baselines on diverse benchmarks.

Sorting is a fundamental operation of all computer systems, having been a long-standing significant research topic. Beyond the problem formulation of traditional sorting algorithms, we consider sorting problems for more abstract yet expressive inputs, e.g., multi-digit images and image fragments, through a neural sorting network. To learn a mapping from a high-dimensional input to an ordinal variable, the differentiability of sorting networks needs to be guaranteed. In this paper we define a softening error by a differentiable swap function, and develop an error-free swap function that holds a non-decreasing condition and differentiability. Furthermore, a permutation-equivariant Transformer network with multi-head attention is adopted to capture dependency between given inputs and also leverage its model capacity with self-attention. Experiments on diverse sorting benchmarks show that our methods perform better than or comparable to baseline methods.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes