15.1 DB May 22
A Pragmatic Approach to Learned Indexing in RocksDB: Targeted Optimizations with Minimal System Modification
Shubham Vashisth, Olivier Michaud, Bettina Kemme et al.
Learned indexes have emerged as a promising alternative to traditional index structures, offering higher throughput and lower memory usage by approximating the cumulative key distribution function with lightweight models. Despite these benefits, adoption in production systems remains limited, partly because learned indexes that support concurrency and persistence as effectively as, e.g., the B+-Tree, do not yet exist, while many research prototypes introduce substantial complexity. In this paper, we investigate whether off-the-shelf learned indexes can be integrated into a production database with minimal storage-engine redesign. Using RocksDB as a case study, we exploit its separation between in-memory Memtables and immutable on-disk files to deploy specialized indexes at each level. We show that directly applying existing learned indexes is insufficient under write-heavy workloads because frequent Memtable replacement prevents models from fully adapting. To address this, we introduce a reuse mechanism that preserves structural knowledge across Memtable instances. At the storage level, we replace RocksDB's disk index with a learned index without modifying the storage layer or read path. We further adapt a read-only learned index to be block-aware, enabling worst-case single-I/O lookups. We implement these techniques in MountDB, an extension of RocksDB. Experiments on large-scale workloads with diverse data distributions and access patterns show up to 1.5X higher write throughput and 2.1X higher read throughput than state-of-the-art systems, demonstrating that established learned indexes can be integrated into production systems with minimal overhead and substantial performance benefits.
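To make the core idea concrete, the following is a minimal sketch of a learned index in general, not of MountDB or of any index evaluated in the paper. It assumes numeric keys held in a sorted in-memory list, fits a single linear model to approximate key position (a scaled cumulative distribution function), and records the worst prediction error so that a lookup only searches a bounded window. The class name `LinearLearnedIndex` and its methods are hypothetical.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: one linear model over sorted keys plus an error bound.

    Illustrative only; assumes a non-empty collection of numeric keys.
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Linear approximation of position ~ (n - 1) * CDF(key).
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        self.intercept = -self.slope * lo
        # Worst-case prediction error, measured over the indexed keys.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        # Predicted position, clamped to the valid index range.
        pos = int(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the index of the first occurrence of key, or None."""
        pos = self._predict(key)
        # Only the window [pos - max_err, pos + max_err] needs searching.
        lo = max(pos - self.max_err, 0)
        hi = min(pos + self.max_err + 1, len(self.keys))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None
```

In this sketch the search window is bounded by `max_err`; a block-aware design of the kind the abstract describes would presumably keep that window within a single disk block, but how the paper achieves this is not stated here.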
37.0 DB May 4
Unfair by design: eBPF-based scheduling of mixed database workloads
Carl-Elliott Bilodeau-Savaria, Jan Kristof Nidzwetzki, Stefanie Scherzinger et al.
Modern database systems increasingly co-schedule time-sensitive and background tasks. In such mixed workloads, background tasks should ideally utilize only spare CPU capacity without interfering with latency-critical requests. While some database-level solutions address this challenge, many database systems still rely on operating system (OS) schedulers, which, despite supporting priorities, do not reliably isolate high-priority tasks. Furthermore, they remain vulnerable to priority inversion, where preempted background tasks can delay other work. We present UFS, a selectively unfair scheduler implemented as an eBPF-based sched_ext scheduler in the Linux kernel. UFS restricts background tasks to idle CPU capacity and preempts them immediately when time-sensitive tasks arrive. To address priority inversion, UFS incorporates application-level hints via eBPF maps, ensuring that background tasks are not unnecessarily delayed should time-sensitive tasks wait for them to release locks. Our integration of UFS into PostgreSQL demonstrates that, under mixed workloads, UFS improves throughput for time-sensitive tasks by up to 2X, while reducing tail latency by half, compared to existing scheduling options in Linux.
50.1 DB Mar 12
Seeing the Trees for the Forest: Leveraging Tree-Shaped Substructures in Property Graphs
Daniel Aarao Reis Arturi, Christoph Köhnen, George Fletcher et al.
Property graphs often contain tree-shaped substructures, yet they are not captured by existing proposals for graph schemas; likewise, query languages and query engines offer little to no native support for managing them systematically. As a first contribution, we report on a micro experiment that demonstrates the optimization potential of treating tree-shaped substructures as first-class citizens in graph database systems. In particular, we show that in systems backed by relational engines, we can achieve substantial speedups by leveraging structural indexes, as originally developed for XML databases, to accelerate path queries. Based on our findings, we put forward a vision in which tree-shaped substructures are systematically managed throughout the graph query lifecycle, from modeling and schema design to indexing and query processing, and outline arising research questions.
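As an illustration of what a structural index can look like, here is a small sketch of pre/post interval numbering, one classic scheme from the XML database literature. The abstract does not say which structural index the authors use, so this is an assumed example rather than their method. The function names `interval_encode` and `is_descendant` and the adjacency-dictionary input format are hypothetical.

```python
def interval_encode(children, root):
    """Assign (pre, post) numbers to every node via iterative depth-first search.

    children: dict mapping a node to the list of its child nodes (assumed format).
    A single counter is shared by entry and exit events, so each node gets
    an interval (pre, post) that strictly contains its descendants' intervals.
    """
    pre, post = {}, {}
    counter = 0
    stack = [(root, False)]
    while stack:
        node, visited = stack.pop()
        if visited:
            # Leaving the node: all descendants have been numbered.
            post[node] = counter
            counter += 1
        else:
            # Entering the node for the first time.
            pre[node] = counter
            counter += 1
            stack.append((node, True))
            # Push children in reverse so they are visited in listed order.
            for child in reversed(children.get(node, [])):
                stack.append((child, False))
    return pre, post


def is_descendant(pre, post, node, ancestor):
    """True if node lies strictly below ancestor in the encoded tree."""
    return pre[ancestor] < pre[node] and post[node] < post[ancestor]
```

With such an encoding stored as two integer columns, an ancestor/descendant test becomes a pair of range comparisons, which a relational engine can evaluate without traversing edges one hop at a time.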