DB LG Dec 24, 2021

Fine-Tuning Data Structures for Analytical Query Processing

arXiv:2112.13099v1
Originality: Incremental advance
AI Analysis

This work addresses performance bottlenecks in analytical database workloads. It is an incremental advance that combines a learned cost model for dictionary operations with static program analysis.

The paper tackles the problem of optimizing data structures for analytical query processing by introducing a framework that automatically selects efficient implementations based on a novel intermediate language and a learned cost model. The results show that the generated code either outperforms or matches state-of-the-art analytical query engines and in-database machine learning frameworks.

We introduce a framework for automatically choosing data structures to support efficient computation of analytical workloads. Our contributions are twofold. First, we introduce a novel low-level intermediate language that can express the algorithms behind various query processing paradigms such as classical joins, groupjoin, and in-database machine learning engines. This language is designed around the notion of dictionaries, and allows for a more fine-grained choice of its low-level implementation. Second, the cost model for alternative implementations is automatically inferred by combining machine learning and program reasoning. The dictionary cost model is learned using a regression model trained over the profiling dataset of dictionary operations on a given hardware architecture. The program cost model is inferred using static program analysis. Our experimental results show the effectiveness of the trained cost model on micro benchmarks. Furthermore, we show that the performance of the code generated by our framework either outperforms or is on par with the state-of-the-art analytical query engines and a recent in-database machine learning framework.
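The two ideas in the abstract can be illustrated with a small Python sketch. It is not the paper's intermediate language or its cost model: `HashDict`, `SortedDict`, `groupjoin`, `profile`, `fit_linear`, and `choose` are hypothetical names introduced here, and a one-variable least-squares fit stands in for whatever regression model the authors actually train. The sketch writes a groupjoin once against a dictionary interface, profiles each candidate implementation on the current machine, and picks the one with the lowest predicted cost.

```python
import time
from bisect import bisect_left


class HashDict:
    """Hash-table dictionary (hypothetical candidate implementation)."""

    def __init__(self):
        self._d = {}

    def get(self, key, default=None):
        return self._d.get(key, default)

    def add(self, key, value):
        # Insert the key, or accumulate into its existing value.
        self._d[key] = self._d.get(key, 0) + value


class SortedDict:
    """Sorted-array dictionary (hypothetical candidate implementation)."""

    def __init__(self):
        self._keys, self._vals = [], []

    def get(self, key, default=None):
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._vals[i]
        return default

    def add(self, key, value):
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._vals[i] += value
        else:
            self._keys.insert(i, key)
            self._vals.insert(i, value)


def groupjoin(customers, orders, make_dict):
    """Join customers with orders on the id and sum amounts per customer."""
    totals = make_dict()
    for cust_id, amount in orders:       # build and aggregate in one pass
        totals.add(cust_id, amount)
    result = {}
    for cust_id, name in customers:      # probe; inner-join semantics
        total = totals.get(cust_id)
        if total is not None:
            result[name] = total
    return result


def profile(make_dict, sizes):
    """Time n inserts followed by n lookups, for each n in sizes."""
    samples = []
    for n in sizes:
        d = make_dict()
        start = time.perf_counter()
        for k in range(n):
            d.add(k, 1)
        for k in range(n):
            d.get(k)
        samples.append((n, time.perf_counter() - start))
    return samples


def fit_linear(samples):
    """Ordinary least squares for time = a * n + b (simplifying assumption)."""
    m = len(samples)
    mean_x = sum(x for x, _ in samples) / m
    mean_y = sum(y for _, y in samples) / m
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    a = sum((x - mean_x) * (y - mean_y) for x, y in samples) / var_x
    return a, mean_y - a * mean_x


def choose(candidates, sizes, expected_n):
    """Return the name of the candidate with the lowest predicted cost."""
    def predicted_cost(name):
        a, b = fit_linear(profile(candidates[name], sizes))
        return a * expected_n + b
    return min(candidates, key=predicted_cost)


customers = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(1, 10), (2, 5), (1, 7)]
candidates = {"hash": HashDict, "sorted": SortedDict}
best = choose(candidates, sizes=[1000, 2000, 4000], expected_n=len(orders))
print(best, groupjoin(customers, orders, candidates[best]))
```

Both candidates produce the same result, `{'ann': 17, 'bob': 5}`; which one `choose` selects depends on the timings measured on the machine at hand. The query logic is written once, so only the `make_dict` argument changes when the selection does.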

