LGDBMEOct 26, 2025

Centrum: Model-based Database Auto-tuning with Minimal Distributional Assumptions

arXiv:2510.22734v1h-index: 2Proc. ACM Manag. Data
Originality Incremental advance
AI Analysis

This work improves database management system auto-tuning for practitioners by offering a more efficient and effective method, though it is incremental as it builds on existing tree-ensemble and gradient boosting techniques.

The paper tackles the problem of database auto-tuning by addressing limitations in existing Bayesian optimization methods, proposing Centrum, a novel auto-tuner that uses stochastic gradient boosting ensembles and conformal prediction to achieve distribution-free uncertainty estimation, and it outperforms 21 state-of-the-art methods in experiments on two DBMSs and three workloads.

Gaussian-Process-based Bayesian optimization (GP-BO), is a prevailing model-based framework for DBMS auto-tuning. However, recent work shows GP-BO-based DBMS auto-tuners significantly outperformed auto-tuners based on SMAC, which features random forest surrogate models; such results motivate us to rethink and investigate the limitations of GP-BO in auto-tuner design. We find the fundamental assumptions of GP-BO are widely violated when modeling and optimizing DBMS performance, while tree-ensemble-BOs (e.g., SMAC) can avoid the assumption pitfalls and deliver improved tuning efficiency and effectiveness. Moreover, we argue that existing tree-ensemble-BOs restrict further advancement in DBMS auto-tuning. First, existing tree-ensemble-BOs can only achieve distribution-free point estimates, but still impose unrealistic distributional assumptions on uncertainty estimates, compromising surrogate modeling and distort the acquisition function. Second, recent advances in gradient boosting, which can further enhance surrogate modeling against vanilla GP and random forest counterparts, have rarely been applied in optimizing DBMS auto-tuners. To address these issues, we propose a novel model-based DBMS auto-tuner, Centrum. Centrum improves distribution-free point and interval estimation in surrogate modeling with a two-phase learning procedure of stochastic gradient boosting ensembles. Moreover, Centrum adopts a generalized SGBE-estimated locally-adaptive conformal prediction to facilitate a distribution-free uncertainty estimation and acquisition function. To our knowledge, Centrum is the first auto-tuner to realize distribution-freeness, enhancing BO's practicality in DBMS auto-tuning, and the first to seamlessly fuse gradient boosting ensembles and conformal inference in BO. Extensive physical and simulation experiments on two DBMSs and three workloads show Centrum outperforms 21 SOTA methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes