DB · LG · Sep 10, 2022

Share the Tensor Tea: How Databases can Leverage the Machine Learning Ecosystem

Microsoft · UW
arXiv:2209.04579v1 · 16 citations · h-index: 34
AI Analysis

This work addresses the challenge of efficiently executing queries that combine relational and ML operators inside a database system, and it opens query processing to a broader range of hardware and software backends. The contribution is incremental in that it builds on existing tensor runtimes rather than introducing a new one.

The paper tackles the integration of relational database queries with the machine learning ecosystem by introducing Tensor Query Processor (TQP), which automatically compiles relational operators into tensor programs. On benchmarks such as TPC-H, TQP delivers performance comparable to, and often better than, that of specialized CPU and GPU query processors.

We demonstrate Tensor Query Processor (TQP): a query processor that automatically compiles relational operators into tensor programs. By leveraging tensor runtimes such as PyTorch, TQP is able to: (1) integrate with ML tools (e.g., Pandas for data ingestion, Tensorboard for visualization); (2) target different hardware (e.g., CPU, GPU) and software (e.g., browser) backends; and (3) end-to-end accelerate queries containing both relational and ML operators. TQP is generic enough to support the TPC-H benchmark, and it provides performance that is comparable to, and often better than, that of specialized CPU and GPU query processors.
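To make the idea of "compiling relational operators into tensor programs" concrete, here is a minimal sketch in PyTorch of a filter followed by a grouped sum, roughly `SELECT flag, SUM(price) FROM t WHERE qty < 24 GROUP BY flag`. It is an illustration only and not TQP's actual compilation output: the function name `filter_group_sum`, the column names, and the assumption that group keys are already dense integer codes in `[0, num_groups)` are all hypothetical.

```python
import torch


def filter_group_sum(keys, values, predicate_col, threshold, num_groups):
    """Hypothetical helper: WHERE predicate_col < threshold, then SUM(values) GROUP BY keys.

    Assumes `keys` holds int64 group codes in [0, num_groups).
    """
    # Selection (WHERE) becomes a boolean mask over the predicate column.
    mask = predicate_col < threshold
    selected_keys = keys[mask]
    selected_values = values[mask]

    # Grouped aggregation (GROUP BY ... SUM) becomes a scatter-add:
    # each selected value is added into the slot of its group code.
    totals = torch.zeros(num_groups, dtype=values.dtype)
    totals.index_add_(0, selected_keys, selected_values)
    return totals


# Toy columnar table: each column is one tensor (values are made up).
flag = torch.tensor([0, 1, 0, 2, 1, 0])
price = torch.tensor([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
qty = torch.tensor([5, 30, 10, 7, 12, 40])

totals = filter_group_sum(flag, price, qty, threshold=24, num_groups=3)
print(totals.tolist())  # per-group sums, indexed by group code
```

Because every step is an ordinary tensor operation, the same program could in principle run on a GPU by moving the input tensors to a CUDA device, which is the kind of hardware portability the abstract attributes to tensor runtimes.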

