SOSD: A Benchmark for Learned Indexes
SOSD addresses the need for rigorous evaluation of learned components in data management systems. Its contribution is incremental, since it focuses on benchmarking rather than on introducing new methods.
The paper tackles skepticism about whether learned index structures outperform traditional ones by proposing a benchmarking framework with real-world datasets and baseline implementations. Its preliminary results for selected index structures show that learned models often outperform state-of-the-art implementations of traditional structures.
A groundswell of recent work has focused on improving data management systems with learned components. Specifically, work on learned index structures has proposed replacing traditional index structures, such as B-trees, with learned models. Given the decades of research committed to improving index structures, there is significant skepticism about whether learned indexes actually outperform state-of-the-art implementations of traditional structures on real-world data. To answer this question, we propose a new benchmarking framework that comes with a variety of real-world datasets and baseline implementations to compare against. We also show preliminary results for selected index structures, and find that learned models indeed often outperform state-of-the-art implementations, and are therefore a promising direction for future research.
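To make the idea of replacing a traditional index with a learned model concrete, here is a minimal sketch, not the SOSD code or any index the benchmark evaluates. It assumes a sorted array of integer keys, fits a single least-squares line from key to array position, records the worst-case prediction error, and then finishes each lookup with a binary search restricted to that error window. The class name `LinearLearnedIndex` and its methods are hypothetical names introduced for illustration only.

```python
import bisect


class LinearLearnedIndex:
    """Hypothetical single-model learned index over a sorted list of keys."""

    def __init__(self, keys):
        # Assumption: keys is a non-empty list sorted in ascending order.
        self.keys = keys
        n = len(keys)

        # Least-squares fit of position ~ slope * key + intercept.
        mean_key = sum(keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key

        # Largest distance between a stored key's true and predicted position.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(keys))

    def _predict(self, key):
        # Model inference: estimate where the key sits in the array.
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return the position of key, or None if it is not stored."""
        n = len(self.keys)
        pred = self._predict(key)
        # Clamp the search window to valid, non-negative array bounds.
        lo = min(max(0, pred - self.max_err), n)
        hi = max(lo, min(n, pred + self.max_err + 1))
        # "Last mile" search: binary search only inside the error window.
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        if pos < hi and self.keys[pos] == key:
            return pos
        return None


# Usage: a mildly non-linear key distribution.
keys = [2 * i * i + 3 for i in range(1000)]
index = LinearLearnedIndex(keys)
position = index.lookup(keys[500])
missing = index.lookup(4)
```

The baseline a benchmark would compare against is a search over the full array (for example `bisect.bisect_left(keys, key)`) or a tree structure. Whether the narrower window pays for the cost of model inference depends on how well the model fits the data, which is why evaluation on real-world datasets matters.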