The Cascade Log: Reference-Stable Windowing over Tiered Append Sequences
This work solves the problem of stable references in tiered append-only data structures for systems like edit logs and event stores, providing both anomaly freedom and provable efficiency bounds.
The Cascade Log introduces a reference-stable tiered append structure that prevents cross-tier anomalies (dangling, stale, corrupt, and snapshot-skewed resolution) in long-running append-mostly sequences. The index uses Θ(A) space, resolves a point in O(log A), and handles updates in O((a/B+s) log A) work, with matching lower bounds showing optimality.
A long-running append-mostly sequence, such as an edit log, event store, or versioned working set, is usually tiered into a bounded hot stratum and colder folded summaries. This saves memory but breaks stable references: a handle minted while a record is hot may later be resolved after the record has moved into a digest, after it has been superseded, or while a fold is in flight. We define the resulting cross-tier anomalies--dangling, stale, corrupt, and snapshot-skewed resolution--and present the Cascade Log, a reference-stable tiered append structure. The structure keeps a single persistent coalescing interval map over handles as the sole authority on each live version; folding a contiguous run replaces many singleton entries by one digest-backed interval node, and immutable roots provide snapshot tokens. Its cost is characterized by the fragmentation $A$, the number of index pieces, namely live handles plus maximal same-digest runs. The index uses $Θ(A)$ space, resolves a point in $O(\log A)$, reports a $k$-handle range in $O(\log A+k)$, and performs $a$ appends and $s$ supersedes in $O((a/B+s)\log A)$ update work for fold block size $B$. Matching lower bounds show that $Ω(A)$ space and $Ω(\log A+k)$ ordered range cost are unavoidable, and an adversary can force $A=Θ(s)$. Thus the index is sublinear on append-dominated histories and grows linearly only under fragmenting edits. A reference implementation and reproducible experiments to $10^6$ records validate the anomaly-freedom and the fragmentation bounds.