DSMay 15

DialSort: Non-Comparative Integer Sorting via the Self-Indexing Principle: Architecture, Implementation, and Substrate-Aware Analysis

arXiv:2605.1666723.6
AI Analysis

For practitioners needing fast sorting of bounded-universe integers, DialSort offers a highly efficient alternative to existing methods, though it is domain-specialized and not a universal replacement.

DialSort introduces a non-comparative sorting architecture for bounded-universe integer keys that eliminates the prefix-sum pass by treating the histogram as the canonical ordered representation. On an 8-thread Intel x86-64, it achieves up to 39.77x speedup over std::sort and peak throughput of 115.9 M keys/s, outperforming Classic Counting Sort, IPS4o, and ska_sort in most configurations.

Sorting over bounded-universe integer keys has traditionally relied on counting sort and radix sort, both of which incur mandatory prefix-sum passes, auxiliary scatter buffers, or multiple permutation passes. This paper introduces DialSort, a non-comparative sorting architecture based on the self-indexing principle: each integer key simultaneously encodes its value and its canonical position in the ordered address space [0,U-1]. DialSort eliminates the prefix-sum pass entirely by treating the histogram H as the canonical ordered representation, not as an intermediate structure. To support parallel ingestion without serialization, we introduce the Conflict Resolution Network (CRN), a pipelined additive reduction tree that resolves concurrent writes using equality checks exclusively, with no magnitude comparisons. Formal proofs establish O(n+U) sequential and O(n/k + log k + U) parallel time bounds. A software prototype on an 8-thread Intel x86-64 achieves 39.77x speedup over std::sort and peak throughput of 115.9 M keys/s. Against Classic Counting Sort, DialSort wins 46 of 48 configurations. Against IPS4o, DialSort outperforms it in 24 of 48 sequential and 29 of 48 parallel configurations. Against ska_sort, it wins 46 of 48 configurations. All 208 benchmark configurations passed correctness verification. DialSort is not a universal replacement for comparison-based sorting, but a domain-specialized architecture for bounded-universe workloads where sorting reduces to a geometric read over memory. Benchmark source and five open interactive simulators are released alongside this paper.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes