DC · Mar 9

SafarDB: FPGA-Accelerated Distributed Transactions via Replicated Data Types

arXiv:2603.08003v1
Predicted impact: top 87% in DC · last 90 days
Originality: Highly original
AI Analysis

This work significantly improves the performance and resilience of distributed transaction systems for data centers by leveraging FPGA acceleration, addressing the bottleneck of data replication coordination.

SafarDB is an FPGA-accelerated distributed transaction system that uses Replicated Data Types (RDTs) to improve data replication in data centers. Compared to a state-of-the-art RDMA-based implementation, it achieves a 7.0x reduction in latency and a 5.3x increase in throughput for Conflict-Free Replicated Data Types (CRDTs), and a 12x reduction in latency and a 6.8x increase in throughput for Well-coordinated Replicated Data Types (WRDTs).

Data replication is a critical aspect of data center design, as it ensures high availability, scalability, and fault tolerance. However, replicas need to be coordinated to maintain convergence and database integrity constraints under transactional workloads. Conflict-Free Replicated Data Types (CRDTs) provide convergence for conflict-free objects using relaxed consistency, and Well-coordinated Replicated Data Types (WRDTs) provide convergence and integrity for general objects using a hybrid model that is relaxed when possible and strong when necessary. While state-of-the-art hardware acceleration of Replicated Data Types (RDTs) uses Remote Direct Memory Access (RDMA), we observe that trends towards lower latency and higher throughput have driven recent data center architectures to leverage FPGAs as application accelerators. In contrast to deploying an FPGA-based Smart NIC, this paper connects an FPGA accelerator card directly to the network, which allows a complete redesign of the NIC to match the needs of the FPGA-hosted application. We co-design a network-attached FPGA replication engine with an FPGA-resident network interface, enabling near-network execution of replicated transactions and direct invocation of FPGA-resident operators. Following this approach, we introduce SafarDB, which provides FPGA-accelerated CRDTs and WRDTs. SafarDB accelerates both relaxed and strongly ordered replication paths; when strong ordering is required, SafarDB accelerates the underlying consensus control path. SafarDB improves CRDT latency and throughput by 7.0x and 5.3x, and WRDT latency and throughput by 12x and 6.8x, compared to a state-of-the-art RDMA-based implementation. Further, experiments demonstrate that SafarDB is more resilient to crash failures than existing CPU/RDMA-based CRDT and WRDT implementations, and that it can detect leader failures and elect new leaders much faster than previously possible.
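To make the relaxed replication path concrete, here is a minimal software sketch of a grow-only counter, a textbook CRDT. It is not SafarDB's FPGA design, and the names `GCounter`, `increment`, `merge`, and `value` are illustrative assumptions rather than anything taken from the paper. Each replica updates only its own slot and merges remote state with an element-wise maximum, so replicas converge regardless of the order or number of merges. The strongly ordered path that WRDTs add for integrity constraints is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter CRDT (hypothetical illustration, not SafarDB code)."""
    replica_id: str
    counts: dict = field(default_factory=dict)  # replica id -> local increments

    def increment(self, amount: int = 1) -> None:
        # A replica only ever raises its own slot; no coordination is needed.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter value is the sum over all replica slots seen so far.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # which is what guarantees convergence under relaxed consistency.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas accept updates independently, then exchange state.
a = GCounter("a")
b = GCounter("b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
a.merge(b)  # a repeated merge changes nothing
print(a.value(), b.value())  # prints "5 5"
```

An operation such as a withdrawal that must keep a balance non-negative does not commute in this way, which is the kind of case where the hybrid model described above falls back to strong ordering.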
