Paris Carbone

DB
3 papers
133 citations
Novelty 32%
AI Score 44

3 Papers

35.0 DB May 28
The Missing Dimensions in Geo-Distributed Database Evaluation

Oto Mraz, Kyriakos Psarakis, George Christodoulou et al.

Geo-distributed OLTP databases are widely deployed across cloud regions, yet current evaluation practices do not capture the challenges that geo-distribution introduces. Existing benchmarks assume stable network conditions, lack explicit settings for data and client locality, and largely ignore data transfer costs across regions. In addition, most evaluations rely on a limited set of geo-distribution patterns. In this paper, we propose Gaia, a comprehensive evaluation framework that addresses these gaps. We use Gaia to evaluate existing geo-distributed OLTP systems, deploying them across multiple cloud regions under different geo-distribution patterns and variable cross-region network conditions. Among other findings, our framework reveals that: i) most systems are sensitive to network instabilities, ii) network costs dominate cloud deployment expenses, and iii) multi-region fault-tolerance mechanisms incur measurable critical-path overhead that is often overlooked in prior evaluations. We argue that the design of future geo-distributed databases must rethink the trade-offs between performance, fault tolerance, and cost.

DC Aug 3, 2020 Code
A Survey on the Evolution of Stream Processing Systems

Marios Fragkoulis, Paris Carbone, Vasiliki Kalavri et al.

Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the functional areas of out-of-order data management, state management, fault tolerance, high availability, load management, elasticity, and reconfiguration. We review noteworthy past research findings, outline the similarities and differences between early ('00-'10) and modern ('11-'22) streaming systems, and discuss recent trends and open problems.

63.5DBMay 5
ConRAD: Conformal Risk-Aware Neural Databases

Sonia Horchidan, Fabian Zeiher, Xiangyu Shi et al.

Querying incomplete knowledge graphs with neural predictors is powerful but dangerous. Errors compound across multi-hop pipelines with no formal bound on the completeness of results. We introduce ConRAD, the first framework to enforce declarative recall guarantees natively within a neural graph database query engine. Given a user-specified risk budget, ConRAD automatically derives per-operator prediction thresholds that satisfy the recall target with finite-sample, distribution-free statistical validity via Conformal Risk Control, while maximizing end-to-end precision. To scale calibration across multi-operator query topologies, we introduce a quantile-space scalarization that reduces intractable high-dimensional threshold searches to a single parameter. We further design the conformal gate, a novel physical operator that dynamically bypasses neural inference when local graph evidence suffices, eliminating unnecessary model inferences in dense graph regions. Evaluated across three benchmarks and three query topologies, ConRAD strictly satisfies all risk budgets, with empirical recall falling below the target by at most 0.046 across all settings. It reduces neural invocations to zero in near-complete graph regions, and achieves precision that matches or exceeds best-case static baselines that offer no guarantees and require manual threshold search.
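
To make the threshold-derivation idea concrete, here is a minimal sketch of the generic Conformal Risk Control rule for a single score threshold, assuming a miss-rate loss (1 minus recall) bounded by 1 and prediction scores in [0, 1]. It is not ConRAD's implementation: the abstract's per-operator thresholds, quantile-space scalarization, and conformal gate are not reproduced, and the names `false_negative_rate` and `crc_threshold` are hypothetical.

```python
from typing import List, Sequence, Tuple

# One calibration query: predicted scores and ground-truth membership flags.
Query = Tuple[Sequence[float], Sequence[bool]]


def false_negative_rate(scores: Sequence[float],
                        labels: Sequence[bool],
                        threshold: float) -> float:
    """Fraction of true answers whose score falls below the threshold (1 - recall)."""
    positives = [s for s, y in zip(scores, labels) if y]
    if not positives:
        return 0.0  # nothing to miss on this query
    missed = sum(1 for s in positives if s < threshold)
    return missed / len(positives)


def crc_threshold(calibration: List[Query],
                  alpha: float,
                  candidates: Sequence[float],
                  fallback: float = 0.0) -> float:
    """Pick the largest candidate threshold whose corrected empirical risk is <= alpha.

    Uses the standard CRC bound for a loss in [0, 1]:
        (n * mean_loss + 1) / (n + 1) <= alpha
    If no candidate qualifies, return `fallback` (assumed to keep every
    candidate answer, i.e. the most conservative choice).
    """
    n = len(calibration)
    best = fallback
    for t in sorted(candidates):
        mean_loss = sum(false_negative_rate(s, y, t) for s, y in calibration) / n
        if (n * mean_loss + 1.0) / (n + 1) <= alpha:
            best = t
        else:
            break  # the miss rate can only grow as the threshold rises
    return best
```

A higher threshold prunes more predicted answers, which tends to raise precision, so the sketch keeps the largest threshold that still fits the risk budget `alpha`. The `+ 1` in the numerator is the finite-sample correction: with few calibration queries, even a zero empirical miss rate may not justify a small budget, and the function then falls back to keeping everything.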