PL · DB · LG · Mar 10, 2021

Functional Collection Programming with Semi-Ring Dictionaries

arXiv:2103.06376v3 · 59 citations
AI Analysis

This paper addresses the challenge of integrating database and linear algebra optimizations, a problem relevant to researchers and practitioners in data processing. It presents a novel unification of the two domains rather than an incremental improvement.

This paper tackles the problem of efficiently processing hybrid database and linear algebra workloads by introducing semi-ring dictionaries and the SDQL language, which unifies optimizations from both domains and demonstrates competitive or superior performance against state-of-the-art systems in experiments.

This paper introduces semi-ring dictionaries, a powerful class of compositional and purely functional collections that subsume other collection types such as sets, multisets, arrays, vectors, and matrices. We developed SDQL, a statically typed language that can express relational algebra with aggregations, linear algebra, and functional collections over data such as relations and matrices using semi-ring dictionaries. Furthermore, thanks to the algebraic structure behind these dictionaries, SDQL unifies a wide range of optimizations commonly used in databases (DB) and linear algebra (LA). As a result, SDQL enables efficient processing of hybrid DB and LA workloads by putting together optimizations that are otherwise confined to either DB systems or LA frameworks. We show experimentally that a handful of DB and LA workloads can take advantage of the SDQL language and optimizations. SDQL is competitive with or outperforms a host of systems that are state of the art in their own domains: the in-memory DB systems Typer and Tectorwise for (flat, not nested) relational data; SciPy for LA workloads; the sparse tensor compiler taco; the Trance nested relational engine; and the in-database machine learning engines LMFAO and Morpheus for hybrid DB/LA workloads over relational data.
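To make the abstract's central claim more concrete, here is a minimal Python sketch of how one dictionary type with a semi-ring addition could stand in for sets, multisets, relations, and sparse matrices. It is not SDQL syntax and not the authors' implementation: the helper names `sd_add` and `sd_sum`, the sample data, and the choice of Python are all assumptions made for illustration. SDQL itself is a statically typed language with its own compiler and optimizations.

```python
# Illustrative sketch only: sd_add and sd_sum are hypothetical helper names,
# not part of SDQL or of any library.

def sd_add(a, b):
    """Semi-ring addition: OR on booleans, + on numbers, pointwise on dicts."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)  # copy, so the inputs are never mutated
        for key, val in b.items():
            out[key] = sd_add(out[key], val) if key in out else val
        return out
    if isinstance(a, bool) and isinstance(b, bool):
        return a or b
    return a + b


def sd_sum(d, f, zero):
    """Fold f(key, value) over dictionary d with sd_add, starting from zero."""
    acc = zero
    for key, val in d.items():
        acc = sd_add(acc, f(key, val))
    return acc


# Set: keys map to True, and union is dictionary addition over booleans.
set_union = sd_add({"a": True, "b": True}, {"b": True, "c": True})

# Multiset (bag): keys map to multiplicities, and addition is bag union.
bag_union = sd_add({"a": 2, "b": 1}, {"a": 1})

# Relation: each row (city, amount) maps to its multiplicity.
sales = {("Oslo", 10): 1, ("Oslo", 5): 2, ("Rome", 7): 1}

# Database-style query: SUM(amount) GROUP BY city.
total_by_city = sd_sum(
    sales, lambda row, mult: {row[0]: row[1] * mult}, zero={}
)

# Sparse matrix as a nested dictionary: row index -> column index -> value.
matrix = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
vector = {0: 4.0, 1: 5.0, 2: 6.0}

# Linear-algebra-style query: matrix-vector product, using the same sd_sum.
mat_vec = sd_sum(
    matrix,
    lambda i, row: {
        i: sd_sum(row, lambda j, a: a * vector.get(j, 0.0), zero=0.0)
    },
    zero={},
)

print(total_by_city)  # {'Oslo': 20, 'Rome': 7}
print(mat_vec)        # {0: 16.0, 1: 15.0}
```

The group-by aggregate and the matrix-vector product are both written with the same summation construct. This is the kind of shared structure that, according to the abstract, lets database and linear algebra optimizations be applied in one framework. The sketch does not prune zero-valued entries and performs no optimization of its own.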

