DB | AI | Feb 14, 2023

Lero: A Learning-to-Rank Query Optimizer

arXiv:2302.06873v2 | 103 citations | h-index: 62
AI Analysis

This work targets performance problems in query optimization that affect DBMS users. Its non-intrusive design reuses the existing native optimizer, which also makes the contribution incremental: it builds on top of native optimizers rather than proposing a new paradigm.

The paper tackles the unstable performance and high training cost of learned query optimizers by introducing Lero, a learning-to-rank optimizer built on top of a native optimizer. Lero reduces plan execution time by up to 70% relative to PostgreSQL's native optimizer and by up to 37% relative to other learned optimizers, and achieves near-optimal performance on several benchmarks.

A recent line of work applies machine learning techniques to assist or rebuild cost-based query optimizers in DBMSs. While these approaches exhibit superiority on some benchmarks, their deficiencies, e.g., unstable performance, high training cost, and slow model updating, stem from the inherent hardness of predicting the cost or latency of execution plans using machine learning models. In this paper, we introduce a learning-to-rank query optimizer, called Lero, which builds on top of a native query optimizer and continuously learns to improve optimization performance. The key observation is that the relative order or rank of plans, rather than their exact cost or latency, is sufficient for query optimization. Lero employs a pairwise approach: it trains a classifier to compare any two plans and tell which one is better. Such a binary classification task is much easier than the regression task of predicting cost or latency, in terms of both model efficiency and accuracy. Rather than building a learned optimizer from scratch, Lero is designed to leverage decades of database wisdom and improve the native query optimizer. With its non-intrusive design, Lero can be implemented on top of any existing DBMS with minimal integration effort. We implement Lero and demonstrate its performance using PostgreSQL. In our experiments, Lero achieves near-optimal performance on several benchmarks. It reduces plan execution time by up to 70% relative to the native optimizer in PostgreSQL and by up to 37% relative to other learned query optimizers. Meanwhile, Lero continuously learns and automatically adapts to query workloads and changes in data.
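The pairwise idea in the abstract can be illustrated with a small sketch. This is not Lero's implementation: the linear scorer, the two plan features, the synthetic `hidden_latency` function, and every function name below are assumptions made for illustration only. The sketch trains a model on pairs of plans labeled only by which one ran faster, then picks the highest-scoring candidate without ever predicting a latency value.

```python
import math
import random


def score(weights, features):
    """Linear ranking score; a higher score means the plan is ranked as better."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(pairs, n_features, epochs=200, lr=0.1):
    """Fit weights from (faster_plan, slower_plan) feature pairs.

    Uses a logistic model on the score difference, so only the relative
    order of the two plans is learned, never an absolute latency.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - w for b, w in zip(better, worse)]
            # Modeled probability that `better` outranks `worse`.
            p = 1.0 / (1.0 + math.exp(-score(weights, diff)))
            # Gradient ascent step on the log-likelihood log(p).
            for i in range(n_features):
                weights[i] += lr * (1.0 - p) * diff[i]
    return weights


def pick_best(weights, candidates):
    """Return the candidate plan with the highest ranking score."""
    return max(candidates, key=lambda f: score(weights, f))


# Synthetic stand-in for measured latencies (hypothetical, not from the paper).
def hidden_latency(features):
    return 2.0 * features[0] + 1.0 * features[1]


rng = random.Random(0)
plans = [[rng.random(), rng.random()] for _ in range(200)]

# Build training pairs ordered as (faster, slower).
pairs = []
for a, b in zip(plans[::2], plans[1::2]):
    pairs.append((a, b) if hidden_latency(a) < hidden_latency(b) else (b, a))

weights = train_pairwise(pairs, n_features=2)

# Fraction of training pairs the model orders correctly.
accuracy = sum(score(weights, b) > score(weights, w) for b, w in pairs) / len(pairs)

candidates = [[0.9, 0.8], [0.1, 0.2], [0.5, 0.5]]
best = pick_best(weights, candidates)
print("pairwise accuracy:", accuracy)
print("chosen plan features:", best)
```

In this toy setup the candidate plans would come from the native optimizer, and the learned comparison only decides which of them to execute, which mirrors the non-intrusive design described above.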

Code Implementations: 1 repo
