DSJun 4

Workload-Aware Autotuning of Block Size in Square-Root Decomposition

arXiv:2606.061450.6

Predicted impact top 51% in DS · last 90 daysOriginality Synthesis-oriented

AI Analysis

For algorithm engineers and practitioners using square-root decomposition, this work demonstrates that workload-aware autotuning can yield consistent performance gains over the textbook choice.

The paper shows that learned workload models can autotune block size in square-root decomposition, reducing mean regret from 1.2882 to 1.0646 and achieving a 1.151x geometric-mean speedup over the fixed sqrt(n) rule.

The textbook choice B=sqrt(n) for square-root decomposition is asymptotically natural, but it is not always the fastest implementation choice. We study block-size autotuning as a reproducible algorithm-engineering problem and show that a learned workload model can improve over fixed sqrt(n) on the tested implementation. Under repeated grouped cross-validation, the best policy is a full-feature KNN-9 model that reduces mean regret from 1.2882 to 1.0646 and yields a paired geometric-mean speedup of 1.151x. A confidence gate retains most of that gain while reducing slowdowns. A family-free full-observation follow-up remains better than fixed blocking, which suggests that the model is learning from workload statistics rather than memorizing labels. In contrast, short-prefix variants do not produce a successful low-overhead online tuner in the current prototype. External validation is selective but supportive: Zipf-Hotspot is the strongest out-of-distribution case, and a six-window Baleen follow-up still improves over fixed blocking. Overall, block-size choice is workload aware and platform aware, and the fixed sqrt(n) rule leaves substantial performance on the table.

View on arXiv PDF

Similar