DBAIMar 16, 2017

Database Learning: Toward a Database that Becomes Smarter Every Time

arXiv:1703.05468v276 citations
Originality Incremental advance
AI Analysis

This addresses a fundamental inefficiency in databases for users and systems relying on approximate query processing, representing a new paradigm rather than an incremental improvement.

The paper tackles the problem of databases not leveraging past query answers to improve future query processing, introducing Database Learning to exploit knowledge from previous queries for faster and more accurate responses. Results show that their system, Verdict, supports 73.7% of real-world queries and speeds them up by up to 23.0x compared to existing approximate query processing systems at the same accuracy level.

In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following observation: the answer to each query reveals some degree of knowledge about the answer to another query because their answers stem from the same underlying distribution that has produced the entire dataset. Exploiting and refining this knowledge should allow us to answer queries more analytically, rather than by reading enormous amounts of raw data. Also, processing more queries should continuously enhance our knowledge of the underlying distribution, and hence lead to increasingly faster response times for future queries. We call this novel idea---learning from past query answers---Database Learning. We exploit the principle of maximum entropy to produce answers, which are in expectation guaranteed to be more accurate than existing sample-based approximations. Empowered by this idea, we build a query engine on top of Spark SQL, called Verdict. We conduct extensive experiments on real-world query traces from a large customer of a major database vendor. Our results demonstrate that Verdict supports 73.7% of these queries, speeding them up by up to 23.0x for the same accuracy level compared to existing AQP systems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes