Column-Oriented Datalog Materialization for Large Knowledge Graphs (Extended Technical Report)
This work addresses the efficiency of Datalog evaluation over large knowledge graphs, which matters for applications that depend on such data. The contribution appears incremental: it builds on existing evaluation methods and adds specific optimizations.
The paper tackles the problem of efficiently materializing Datalog inferences over large knowledge graphs. It introduces a column-based memory layout together with runtime optimizations that avoid redundant inferences. The resulting performance often matches or surpasses that of state-of-the-art systems, particularly under resource constraints.
The evaluation of Datalog rules over large Knowledge Graphs (KGs) is essential for many applications. In this paper, we present a new method of materializing Datalog inferences, which combines a column-based memory layout with novel optimization methods that avoid redundant inferences at runtime. The proactive caching of certain subqueries further increases efficiency. Our empirical evaluation shows that this approach can often match or even surpass the performance of state-of-the-art systems, especially under restricted resources.
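To make the idea of column-oriented materialization more concrete, the following is a minimal sketch and not the system described in the paper. It assumes a single binary predicate, a hard-coded transitive-closure program (T(x,y) :- E(x,y) and T(x,z) :- T(x,y), E(y,z)), and standard semi-naive evaluation as a stand-in for the paper's own redundancy-avoiding optimizations. The names `ColumnTable` and `transitive_closure` are hypothetical, and the `known` set is a row-oriented shortcut for duplicate elimination rather than part of any column layout.

```python
from array import array


class ColumnTable:
    """Binary relation stored as two parallel, sorted integer columns.

    Hypothetical layout for illustration; each table is treated as immutable.
    """

    def __init__(self, rows=()):
        rows = sorted(set(rows))  # sort and deduplicate once, at build time
        self.col0 = array("q", (r[0] for r in rows))
        self.col1 = array("q", (r[1] for r in rows))

    def __len__(self):
        return len(self.col0)

    def rows(self):
        # Reassemble tuples from the two columns.
        return zip(self.col0, self.col1)


def transitive_closure(edges):
    """Materialize T as a list of append-only column blocks, one per iteration."""
    edge_table = ColumnTable(edges)

    # Index E on its first column so the join T(x,y), E(y,z) is a lookup on y.
    succ = {}
    for src, dst in edge_table.rows():
        succ.setdefault(src, []).append(dst)

    blocks = [edge_table]            # T(x,y) :- E(x,y)
    known = set(edge_table.rows())   # assumption: simple set-based dedup
    delta = edge_table

    # Semi-naive evaluation: join only the facts derived in the previous round.
    while len(delta) > 0:
        new_rows = set()
        for x, y in delta.rows():
            for z in succ.get(y, ()):
                if (x, z) not in known:  # skip inferences already materialized
                    new_rows.add((x, z))
        known |= new_rows
        delta = ColumnTable(new_rows)
        if len(delta) > 0:
            blocks.append(delta)
    return blocks
```

Calling `transitive_closure([(1, 2), (2, 3), (3, 4)])` returns one block per round of derivation; iterating `block.rows()` over all blocks yields the full set of materialized facts. Keeping each round in its own immutable block is one simple way to limit later joins to recently derived facts, which is the kind of redundancy the abstract refers to avoiding.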