DB · DC · LG · Dec 23, 2020

Learned Indexes for a Google-scale Disk-based Database

arXiv:2012.12501v1 · 46 citations
AI Analysis

This work addresses the practical integration of learned indexes into a large-scale, disk-based database system, a meaningful incremental step for database practitioners.

This paper integrates learned index structures into Google's Bigtable, a distributed, disk-based database. The integration significantly improves end-to-end read latency and throughput for Bigtable.

There is great excitement about learned index structures, but understandable skepticism about the practicality of a new method uprooting decades of research on B-Trees. In this paper, we work to remove some of that uncertainty by demonstrating how a learned index can be integrated in a distributed, disk-based database system: Google's Bigtable. We detail several design decisions we made to integrate learned indexes in Bigtable. Our results show that integrating a learned index significantly improves the end-to-end read latency and throughput for Bigtable.

