DB · AI · May 6, 2021

A Unified Transferable Model for ML-Enhanced DBMS

arXiv:2105.02418v3 · 37 citations
AI Analysis

This work addresses the high training cost and limited effectiveness of ML-based methods in DBMS, offering a more practical approach for cloud DB services, though it appears incremental in that it builds on existing ML methods.

The paper argues that existing machine learning solutions for database management systems (DBMS) are inefficient and impractical, and proposes a unified transferable model, MTMLF. MTMLF uses multi-task training and a pre-train fine-tune procedure to transfer knowledge across tasks and across databases, and reports promising results in a query optimization case study.

Recently, the database management system (DBMS) community has witnessed the power of machine learning (ML) solutions for DBMS tasks. Despite their promising performance, these existing solutions can hardly be considered satisfactory. First, these ML-based methods in DBMS are not effective enough because they are optimized on each specific task, and cannot explore or understand the intrinsic connections between tasks. Second, the training process has serious limitations that hinder their practicality, because they need to retrain the entire model from scratch for a new DB. Moreover, for each retraining, they require an excessive amount of training data, which is very expensive to acquire and unavailable for a new DB. We propose to explore the transferability of the ML methods both across tasks and across DBs to tackle these fundamental drawbacks. In this paper, we propose a unified model MTMLF that uses a multi-task training procedure to capture the transferable knowledge across tasks and a pre-train fine-tune procedure to distill the transferable meta knowledge across DBs. We believe this paradigm is more suitable for cloud DB services, and has the potential to revolutionize the way ML is used in DBMS. Furthermore, to demonstrate the predictive power and viability of MTMLF, we provide a concrete and very promising case study on query optimization tasks. Last but not least, we discuss several concrete research opportunities along this line of work.
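The abstract names two ingredients, multi-task training and a pre-train fine-tune procedure, without spelling out the architecture. The sketch below is only a toy illustration of that general pattern, not MTMLF itself: a shared linear encoder with one head per task is trained jointly on a source DB, then the encoder is frozen and only the heads are fine-tuned on a small sample from a new DB. Everything here is an assumption made for illustration, including the synthetic data, the task names `card` and `cost`, and the helper names `make_db`, `SharedEncoderModel`, and `run_demo`.

```python
import numpy as np


def make_db(rng, n, d=8, shift=0.0):
    """Synthetic stand-in for one DB's workload (hypothetical, not real query data)."""
    x = rng.normal(size=(n, d))                  # fake query features
    w_card = np.linspace(-1.0, 1.0, d) + shift   # fake "cardinality" task weights
    w_cost = np.linspace(1.0, -0.5, d) + shift   # fake "cost" task weights
    return x, {"card": x @ w_card, "cost": x @ w_cost}


class SharedEncoderModel:
    """One encoder shared by all tasks, plus one linear head per task."""

    def __init__(self, rng, d=8, h=16):
        self.enc = rng.normal(scale=0.1, size=(d, h))
        self.heads = {"card": np.zeros(h), "cost": np.zeros(h)}

    def loss(self, x, targets):
        # Multi-task objective: sum of per-task mean squared errors.
        z = x @ self.enc
        return float(sum(np.mean((z @ self.heads[t] - y) ** 2)
                         for t, y in targets.items()))

    def step(self, x, targets, lr=0.02, train_encoder=True):
        # One full-batch gradient step on the summed loss.
        z = x @ self.enc
        grad_enc = np.zeros_like(self.enc)
        for t, y in targets.items():
            err = z @ self.heads[t] - y
            # Encoder gradient accumulates contributions from every task.
            grad_enc += 2.0 * np.outer(x.T @ err, self.heads[t]) / len(y)
            self.heads[t] = self.heads[t] - lr * 2.0 * (z.T @ err) / len(y)
        if train_encoder:
            self.enc = self.enc - lr * grad_enc


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    model = SharedEncoderModel(rng)

    # Pre-train: joint multi-task training on a data-rich source DB.
    x_src, y_src = make_db(rng, n=256)
    pre_before = model.loss(x_src, y_src)
    for _ in range(500):
        model.step(x_src, y_src)
    pre_after = model.loss(x_src, y_src)

    # Fine-tune: new DB with few samples; encoder frozen, heads adapted.
    x_new, y_new = make_db(rng, n=32, shift=0.3)
    enc_snapshot = model.enc.copy()
    ft_before = model.loss(x_new, y_new)
    for _ in range(200):
        model.step(x_new, y_new, train_encoder=False)
    ft_after = model.loss(x_new, y_new)

    return {
        "pretrain_before": pre_before,
        "pretrain_after": pre_after,
        "finetune_before": ft_before,
        "finetune_after": ft_after,
        "encoder_frozen": bool(np.array_equal(enc_snapshot, model.enc)),
    }


results = run_demo()
```

Freezing the encoder during fine-tuning is one common design choice for keeping adaptation cheap on a new DB; whether the paper freezes, partially updates, or fully updates shared parameters is not stated in the abstract.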

Code Implementations: 1 repo
