DB LG Oct 24, 2022

Deploying a Steered Query Optimizer in Production at Microsoft

Wangda Zhang, Matteo Interlandi, Paul Mineiro, Shi Qiao, Nasim Ghazanfari, Karlen Lie, Marc Friedman, Rafah Hosn, Hiren Patel, Alekh Jindal

arXiv:2210.13625v1 5.931 citations h-index: 26

Originality: Incremental advance

AI Analysis

This work addresses performance optimization for large-scale data processing at Microsoft. It is an incremental advance because it builds on prior research in steering query optimizers.

The paper argues that generic query optimizers are inadequate for heterogeneous analytical workloads and specializes them to a given workload through steering. The resulting production system is enabled by default at Microsoft, and the paper reports detailed results on production SCOPE workloads.

Modern analytical workloads are highly heterogeneous and massively complex, making generic query optimizers untenable for many customers and scenarios. As a result, it is important to specialize these optimizers to instances of the workloads. In this paper, we continue a recent line of work in steering a query optimizer towards better plans for a given workload, and make major strides in pushing previous research ideas to production deployment. Along the way we solve several operational challenges, including making steering actions more manageable, keeping the costs of steering within budget, and avoiding unexpected performance regressions in production. Our resulting system, QO-Advisor, essentially externalizes the query planner to a massive offline pipeline for better exploration and specialization. We discuss various aspects of our design and show detailed results over production SCOPE workloads at Microsoft, where the system is currently enabled by default.
