DB | LG | Jan 8, 2024

Sibyl: Forecasting Time-Evolving Query Workloads

Microsoft
arXiv:2401.03723v1 | 15 citations | h-index: 26 | Proc. ACM Manag. Data
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of tuning database performance for future workloads rather than past ones. The advance is incremental but practically important for database administrators and systems.

The paper tackles the problem of forecasting time-evolving query workloads in database systems. On four real workloads it achieves an 87.3% median F1 score, and it yields 1.7x and 1.3x performance improvements when applied to materialized view selection and index selection, respectively.

Database systems often rely on historical query traces to perform workload-based performance tuning. However, real production workloads are time-evolving, making historical queries ineffective for optimizing future workloads. To address this challenge, we propose SIBYL, an end-to-end machine learning-based framework that accurately forecasts a sequence of future queries, with the entire query statements, in various prediction windows. Drawing insights from real workloads, we propose template-based featurization techniques and develop a stacked-LSTM with an encoder-decoder architecture for accurate forecasting of query workloads. We also develop techniques to improve forecasting accuracy over large prediction windows and achieve high scalability over large workloads with high variability in arrival rates of queries. Finally, we propose techniques to handle workload drifts. Our evaluation on four real workloads demonstrates that SIBYL can forecast workloads with an $87.3\%$ median F1 score, and can result in $1.7\times$ and $1.3\times$ performance improvement when applied to materialized view selection and index selection applications, respectively.
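
The abstract mentions template-based featurization but does not spell it out here. As a rough illustration only, and not the authors' implementation, the Python sketch below shows one common way to reduce a query to a template: literals are replaced with `?` placeholders and kept as a separate parameter list. The function name `to_template`, the regular expression, and the example queries are all assumptions made for this sketch; a real system would more likely use a SQL parser.

```python
import re
from collections import Counter

# Assumed literal pattern for this sketch: single-quoted strings
# (with '' as an escaped quote) or standalone numeric literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def to_template(query):
    """Split a SQL string into (template, parameters).

    Literals are replaced left to right with '?', and whitespace in the
    resulting template is normalized so that queries differing only in
    constants or spacing map to the same template.
    """
    params = []

    def _capture(match):
        params.append(match.group(0))
        return "?"

    template = _LITERAL.sub(_capture, query)
    template = " ".join(template.split())
    return template, params


# Hypothetical trace: two queries share a template, one does not.
trace = [
    "SELECT * FROM orders WHERE id = 42 AND status = 'open'",
    "SELECT *  FROM orders WHERE id = 7 AND status = 'closed'",
    "SELECT name FROM users WHERE age > 30",
]

template_counts = Counter(to_template(q)[0] for q in trace)
```

Under this view, a forecaster would predict which templates recur and with which parameter values, rather than treating every raw query string as unique.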

