DSJun 4

Workload-Aware Autotuning of Block Size in Square-Root Decomposition

arXiv:2606.061450.6
Predicted impact top 51% in DS · last 90 daysOriginality Synthesis-oriented
AI Analysis

For algorithm engineers and practitioners using square-root decomposition, this work demonstrates that workload-aware autotuning can yield consistent performance gains over the textbook choice.

The paper shows that learned workload models can autotune block size in square-root decomposition, reducing mean regret from 1.2882 to 1.0646 and achieving a 1.151x geometric-mean speedup over the fixed sqrt(n) rule.

The textbook choice B=sqrt(n) for square-root decomposition is asymptotically natural, but it is not always the fastest implementation choice. We study block-size autotuning as a reproducible algorithm-engineering problem and show that a learned workload model can improve over fixed sqrt(n) on the tested implementation. Under repeated grouped cross-validation, the best policy is a full-feature KNN-9 model that reduces mean regret from 1.2882 to 1.0646 and yields a paired geometric-mean speedup of 1.151x. A confidence gate retains most of that gain while reducing slowdowns. A family-free full-observation follow-up remains better than fixed blocking, which suggests that the model is learning from workload statistics rather than memorizing labels. In contrast, short-prefix variants do not produce a successful low-overhead online tuner in the current prototype. External validation is selective but supportive: Zipf-Hotspot is the strongest out-of-distribution case, and a six-window Baleen follow-up still improves over fixed blocking. Overall, block-size choice is workload aware and platform aware, and the fixed sqrt(n) rule leaves substantial performance on the table.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes