DBAIMar 8, 2021

A Reinforcement Learning Based R-Tree for Spatial Data Indexing in Dynamic Environments

arXiv:2103.04541v26 citations
AI Analysis

This work addresses a domain-specific problem for database systems by offering an incremental improvement over existing R-Tree variants without requiring structural changes.

The paper tackles the problem of improving query performance in spatial data indexing by using reinforcement learning to optimize R-Tree construction decisions, such as subtree selection and node splitting, resulting in faster query processing times on datasets with over 100 million objects.

Learned indices have been proposed to replace classic index structures like B-Tree with machine learning (ML) models. They require to replace both the indices and query processing algorithms currently deployed by the databases, and such a radical departure is likely to encounter challenges and obstacles. In contrast, we propose a fundamentally different way of using ML techniques to improve on the query performance of the classic R-Tree without the need of changing its structure or query processing algorithms. Specifically, we develop reinforcement learning (RL) based models to decide how to choose a subtree for insertion and how to split a node when building an R-Tree, instead of relying on hand-crafted heuristic rules currently used by R-Tree and its variants. Experiments on real and synthetic datasets with up to more than 100 million spatial objects clearly show that our RL based index outperforms R-Tree and its variants in terms of query processing time.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes