Living Databases: A Unified Model for Continuous Schema Evolution, Versioning, and Transformations
This work addresses fragmentation in database evolution research by providing a single framework that unifies functionality previously studied in isolation, such as schema evolution, versioning, and transformations, to the benefit of database system designers and researchers.
The paper proposes a unified abstraction for continuous schema evolution, versioning, and transformations in databases, integrating provenance tracking, conditional update propagation, and configurable alerts. Initial experimental results from a prototype based on a modified Prolly Tree data structure demonstrate feasibility.
Databases, and datasets more generally, evolve continuously through updates, transformations, versioning, schema changes, streaming operations, and other mechanisms. While prior work has noted connections among some of these areas, they have traditionally been studied in isolation, each with its own abstractions, algorithms, and system implementations. In this paper, we argue for unifying these diverse functionalities under a single abstraction and a common set of computational primitives. We present such an abstraction, powerful enough to encompass existing use cases and to support new ones. Going beyond previous approaches, our framework integrates provenance tracking for system-visible operations, conditional propagation of updates, and configurable alerts on change events. It also offers a principled treatment of dependent objects such as views and derived artifacts such as machine learning models by providing declarative mechanisms to control their evolution. Finally, we sketch a prototype implementation in a relational-like database system based on an adaptation of the "Prolly Tree", a Merkle-tree-inspired data structure with tunable parameters to meet varying performance requirements, and report initial experimental results.
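To make the Prolly Tree idea concrete, the following is a minimal sketch of the general technique, not the adaptation used in the prototype, whose details the abstract does not give. It assumes sorted key-value entries, chunk boundaries chosen from a hash of each entry, and a Merkle-style hash per chunk. All names (`is_boundary`, `chunk_level`, `chunk_hash`, `build_root`) are hypothetical, and `target_size` stands in for one possible tunable parameter (the expected number of entries per chunk).

```python
import hashlib


def _entry_digest(key: bytes, value: bytes) -> bytes:
    """Hash one entry with length prefixes so (key, value) is unambiguous."""
    h = hashlib.sha256()
    for part in (key, value):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


def is_boundary(key: bytes, value: bytes, target_size: int) -> bool:
    """An entry closes its chunk with probability of roughly 1/target_size."""
    n = int.from_bytes(_entry_digest(key, value)[:8], "big")
    return n % target_size == 0


def chunk_level(entries, target_size):
    """Split sorted (key, value) entries into content-defined chunks."""
    chunks, current = [], []
    for key, value in entries:
        current.append((key, value))
        if is_boundary(key, value, target_size):
            chunks.append(current)
            current = []
    if current:  # trailing entries that never hit a boundary
        chunks.append(current)
    return chunks


def chunk_hash(chunk) -> bytes:
    """Merkle-style hash of a chunk: a digest over its entry digests."""
    h = hashlib.sha256()
    for key, value in chunk:
        h.update(_entry_digest(key, value))
    return h.digest()


def build_root(items: dict, target_size: int = 4) -> bytes:
    """Build the tree bottom-up and return the root hash."""
    entries = sorted(items.items())
    if not entries:
        return hashlib.sha256(b"").digest()
    while True:
        chunks = chunk_level(entries, target_size)
        # Simplification (assumption): if chunking did not shrink the level,
        # collapse it into a single root node so the loop always terminates.
        if len(chunks) == len(entries) and len(chunks) > 1:
            return chunk_hash(entries)
        if len(chunks) == 1:
            return chunk_hash(chunks[0])
        # The next level up has one entry per chunk: (last key, child hash).
        entries = [(chunk[-1][0], chunk_hash(chunk)) for chunk in chunks]
```

Because boundaries depend only on entry contents, the same set of entries yields the same root hash regardless of insertion order, and changing one value alters at most two leaf chunks in this sketch. Raising `target_size` gives larger, fewer chunks, while lowering it gives finer-grained sharing between versions.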