Calisto Zuzarte

h-index18
2papers

2 Papers

DBJan 22, 2024
LearnedWMP: Workload Memory Prediction Using Distribution of Query Templates

Shaikh Quader, Andres Jaramillo, Sumona Mukhopadhyay et al.

In a modern DBMS, working memory is frequently the limiting factor when processing in-memory analytic query operations such as joins, sorting, and aggregation. Existing resource estimation approaches for a DBMS estimate the resource consumption of a query by computing an estimate of each individual database operator in the query execution plan. Such an approach is slow and error-prone as it relies upon simplifying assumptions, such as uniformity and independence of the underlying data. Additionally, the existing approach focuses on individual queries separately and does not factor in other queries in the workload that may be executed concurrently. In this research, we are interested in query performance optimization under concurrent execution of a batch of queries (a workload). Specifically, we focus on predicting the memory demand for a workload rather than providing separate estimates for each query within it. We introduce the problem of workload memory prediction and formalize it as a distribution regression problem. We propose Learned Workload Memory Prediction (LearnedWMP) to improve and simplify estimating the working memory demands of workloads. Through a comprehensive experimental evaluation, we show that LearnedWMP reduces the memory estimation error of the state-of-the-practice method by up to 47.6%. Compared to an alternative single-query model, during training and inferencing, the LearnedWMP model and its variants were 3x to 10x faster. Moreover, LearnedWMP-based models were at least 50% smaller in most cases. Overall, the results demonstrate the advantages of the LearnedWMP approach and its potential for a broader impact on query performance optimization.

DBJan 26, 2024
Robust Plan Evaluation based on Approximate Probabilistic Machine Learning

Amin Kamali, Verena Kantere, Calisto Zuzarte et al.

Query optimizers in RDBMSs search for execution plans expected to be optimal for given queries. They use parameter estimates, often inaccurate, and make assumptions that may not hold in practice. Consequently, they may select plans that are suboptimal at runtime if estimates and assumptions are not valid. Therefore, they do not sufficiently support robust query optimization. Using ML to improve data systems has shown promising results for query optimization. Inspired by this, we propose Robust Query Optimizer (Roq), a holistic framework based on a risk-aware learning approach. Roq includes a novel formalization of the notion of robustness in the context of query optimization and a principled approach for its quantification and measurement based on approximate probabilistic ML. It also includes novel strategies and algorithms for query plan evaluation and selection. Roq includes a novel learned cost model that is designed to predict the cost of query execution and the associated risks and performs query optimization accordingly. We demonstrate that Roq provides significant improvements in robust query optimization compared with the state-of-the-art.