DB · LG · Jun 15, 2020

NeuroCard: One Cardinality Estimator for All Tables

arXiv:2006.08109v2 · 138 citations
AI Analysis

NeuroCard addresses a critical bottleneck in query optimization: inaccurate cardinality estimates for complex multi-table queries. It improves accuracy over prior estimators and scales to dozens of tables.

The paper tackles inaccurate cardinality estimation for complex queries by developing NeuroCard, a single neural density estimator that learns correlations across all tables without independence assumptions. It reports a state-of-the-art maximum error of 8.5× on JOB-light, scales to dozens of tables, stays compact (several MBs), and takes seconds to minutes to construct or update.
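The "maximum error" figure is easier to read with a concrete metric in hand. The sketch below assumes the error is the multiplicative Q-error commonly used in cardinality estimation benchmarks; the function names and the clamping of values to at least 1 are illustrative choices, not taken from this page.

```python
from typing import Iterable


def q_error(estimate: float, truth: float) -> float:
    """Multiplicative error between an estimated and a true cardinality.

    Assumption: both values are clamped to at least 1 so that empty
    results do not cause a division by zero.
    """
    est = max(estimate, 1.0)
    tru = max(truth, 1.0)
    return max(est / tru, tru / est)


def max_q_error(estimates: Iterable[float], truths: Iterable[float]) -> float:
    """Worst-case Q-error over a workload of queries."""
    return max(q_error(e, t) for e, t in zip(estimates, truths))
```

Under this reading, a maximum error of 8.5× would mean that no query in the workload was over- or under-estimated by more than a factor of 8.5.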

Query optimizers rely on accurate cardinality estimates to produce good execution plans. Despite decades of research, existing cardinality estimators are inaccurate for complex queries, due to making lossy modeling assumptions and not capturing inter-table correlations. In this work, we show that it is possible to learn the correlations across all tables in a database without any independence assumptions. We present NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database. Leveraging join sampling and modern deep autoregressive models, NeuroCard makes no inter-table or inter-column independence assumptions in its probabilistic modeling. NeuroCard achieves orders of magnitude higher accuracy than the best prior methods (a new state-of-the-art result of 8.5$\times$ maximum error on JOB-light), scales to dozens of tables, while being compact in space (several MBs) and efficient to construct or update (seconds to minutes).
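To make the idea of an autoregressive density model over a join concrete, here is a minimal sketch. It is not NeuroCard's implementation: it swaps the neural network for exact frequency tables and the join sampler for a tiny, fully materialized join. The toy data, column meanings, and function names are all hypothetical. It only illustrates the chain-rule factorization P(x1, x2) = P(x1) · P(x2 | x1) and how a probability is turned into a cardinality by scaling with the join size.

```python
from collections import Counter

# Hypothetical materialized join result: (production year, role) pairs.
join_rows = [
    (2000, "actor"), (2000, "actor"), (2000, "director"),
    (2010, "actor"), (2010, "director"), (2010, "director"),
]


def fit_autoregressive(rows):
    """Fit P(x1) and P(x2 | x1) as empirical frequency tables."""
    n = len(rows)
    first = Counter(r[0] for r in rows)
    pair = Counter(rows)
    p_x1 = {v: c / n for v, c in first.items()}
    p_x2_given_x1 = {(v1, v2): c / first[v1] for (v1, v2), c in pair.items()}
    return p_x1, p_x2_given_x1


def estimate_cardinality(rows, x1_values, x2_values):
    """Estimate how many join rows satisfy x1 IN x1_values AND x2 IN x2_values."""
    p_x1, p_x2_given_x1 = fit_autoregressive(rows)
    prob = 0.0
    for v1 in x1_values:
        for v2 in x2_values:
            # Chain rule: no independence assumption between the two columns.
            prob += p_x1.get(v1, 0.0) * p_x2_given_x1.get((v1, v2), 0.0)
    return prob * len(rows)


estimate = estimate_cardinality(join_rows, {2010}, {"director"})
```

Because the conditionals here are exact, the estimate matches the true count on this toy join. A learned model would approximate the same conditionals with a network trained on samples from the join, trading exactness for compactness.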

Code Implementations: 1 repo