LGAIFeb 2, 2019

Learned Indexes for Dynamic Workloads

arXiv:1902.00655v19 citations
Originality Incremental advance
AI Analysis

This addresses a bottleneck in database indexing for real-world applications with dynamic data, though it is incremental as it builds on existing learned index methods.

The paper tackles the problem of learned index structures being ineffective for dynamic workloads with skew query distributions and evolving data, proposing Doraemon which improves query latency by 45.1% and reduces model re-training time to 1/20.

The recent proposal of learned index structures opens up a new perspective on how traditional range indexes can be optimized. However, the current learned indexes assume the data distribution is relatively static and the access pattern is uniform, while real-world scenarios consist of skew query distribution and evolving data. In this paper, we demonstrate that the missing consideration of access patterns and dynamic data distribution notably hinders the applicability of learned indexes. To this end, we propose solutions for learned indexes for dynamic workloads (called Doraemon). To improve the latency for skew queries, Doraemon augments the training data with access frequencies. To address the slow model re-training when data distribution shifts, Doraemon caches the previously-trained models and incrementally fine-tunes them for similar access patterns and data distribution. Our preliminary result shows that, Doraemon improves the query latency by 45.1% and reduces the model re-training time to 1/20.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes