Kepler: Robust Learning for Faster Parametric Query Optimization
This work targets the slow plans that database query optimizers often choose for parametric queries, a known bottleneck, and proposes a learning-based method to address it.
The paper tackles inaccurate cost models in parametric query optimization (PQO). It proposes Kepler, an end-to-end learning-based approach that combines a novel plan generation algorithm with actual execution data to predict faster query plans, and it reports significant speedups in query latency over a traditional optimizer on multiple datasets on PostgreSQL.
Most existing parametric query optimization (PQO) techniques rely on traditional query optimizer cost models, which are often inaccurate and result in suboptimal query performance. We propose Kepler, an end-to-end learning-based approach to PQO that demonstrates significant speedups in query latency over a traditional query optimizer. Central to our method is Row Count Evolution (RCE), a novel plan generation algorithm based on perturbations in the sub-plan cardinality space. While previous approaches require accurate cost models, we bypass this requirement by evaluating candidate plans via actual execution data and training an ML model to predict the fastest plan given parameter binding values. Our models leverage recent advances in neural network uncertainty in order to robustly predict faster plans while avoiding regressions in query performance. Experimentally, we show that Kepler achieves significant improvements in query runtime on multiple datasets on PostgreSQL.
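To make the two ideas in the abstract concrete, here is a minimal Python sketch; it is not the authors' implementation. It assumes a hypothetical callback `plan_for` that returns a plan identifier for a given set of sub-plan row counts, standing in for a real optimizer invoked with overridden cardinalities. The function names, the perturbation factor, the generation loop, and the confidence threshold are all illustrative assumptions. In particular, a plain probability threshold stands in for whatever uncertainty estimate the trained model provides.

```python
import random
from typing import Callable, Dict, List

# Hypothetical representation: sub-plan name -> estimated row count.
Cardinalities = Dict[str, float]


def row_count_perturbations(
    base: Cardinalities,
    plan_for: Callable[[Cardinalities], str],
    generations: int = 3,
    samples_per_generation: int = 20,
    max_exponent: int = 2,
    factor: float = 10.0,
    seed: int = 0,
) -> List[str]:
    """Collect distinct candidate plans by perturbing sub-plan row counts.

    Each generation samples a parent cardinality assignment, scales every
    row count by factor**k for a random integer k, and keeps the assignment
    if it makes the (assumed) optimizer callback produce an unseen plan.
    """
    rng = random.Random(seed)
    plans = [plan_for(base)]  # the plan for the unperturbed estimates comes first
    frontier = [base]
    for _ in range(generations):
        next_frontier = []
        for _ in range(samples_per_generation):
            parent = rng.choice(frontier)
            child = {
                name: max(1.0, rows * factor ** rng.randint(-max_exponent, max_exponent))
                for name, rows in parent.items()
            }
            plan = plan_for(child)
            if plan not in plans:
                plans.append(plan)
                next_frontier.append(child)
        if not next_frontier:
            break  # no new plans were found, so stop early
        frontier = next_frontier
    return plans


def choose_plan(
    plan_probs: Dict[str, float],
    default_plan: str,
    min_confidence: float = 0.9,
) -> str:
    """Use the model's top plan only when it is confident; otherwise fall back.

    Falling back to the optimizer's default plan is the assumed mechanism
    for avoiding regressions when the prediction is uncertain.
    """
    best_plan, best_prob = max(plan_probs.items(), key=lambda kv: kv[1])
    return best_plan if best_prob >= min_confidence else default_plan


def toy_plan_for(cards: Cardinalities) -> str:
    """Toy stand-in for an optimizer: large join outputs favor a hash join."""
    return "hash_join" if cards["a_join_b"] >= 500 else "nested_loop"


if __name__ == "__main__":
    candidates = row_count_perturbations(
        {"a_join_b": 100.0, "a_join_b_join_c": 50.0}, toy_plan_for
    )
    print(candidates)
    print(choose_plan({"hash_join": 0.62, "nested_loop": 0.38}, default_plan="nested_loop"))
```

In a real system, each candidate plan would then be executed on sampled parameter bindings to collect the latency data that the model is trained on. The toy callback above only illustrates the control flow.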