DB · Mar 7

Enhancing OLAP Resilience at LinkedIn

arXiv:2603.07382v1
Predicted impact: top 19% in DB · last 90 days
Originality: Incremental advance
AI Analysis

This work provides specific resiliency mechanisms for large-scale OLAP systems, particularly beneficial for enterprises like LinkedIn that rely on Apache Pinot for critical real-time analytics.

This paper addresses the challenge of maintaining strict SLAs for real-time OLAP datastores at LinkedIn under failures, load spikes, and cluster changes. The authors present a set of resiliency mechanisms for Apache Pinot, including Query Workload Isolation (QWI), which provides workload-level CPU and memory budgeting with under 1% overhead, and Impact-Free Rebalancing for SLA-safe data movement.

Real-time OLAP datastores are critical infrastructure for modern enterprises, powering interactive analytics on petabyte-scale datasets with subsecond latency requirements. As these systems become integral to service architectures, maintaining strict SLAs under failures, load spikes, and cluster changes is as important as raw performance. We present a set of resiliency mechanisms developed for Apache Pinot at LinkedIn, applicable to modern OLAP systems broadly. We introduce Query Workload Isolation (QWI), which provides workload-level CPU and memory budgeting across Pinot's broker and server tiers via fine-grained resource accounting and sub-millisecond enforcement, delivering predictable tail latency and fairness with under 1% overhead. We present Impact-Free Rebalancing for SLA-safe data movement during routine operations (e.g., upgrades, scale-out, and recovery), and Maintenance Zone Awareness to place replicas across fault domains and mitigate correlated failures. We also describe Adaptive Server Selection, which routes queries using real-time load and performance signals to avoid slow or failing nodes while preserving balanced utilization. Together, these mechanisms form a holistic resiliency framework deployed in production at LinkedIn, enabling stable query latency and high availability at scale.
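To make the idea behind Adaptive Server Selection more concrete, here is a minimal sketch of load-aware replica routing. It is not Pinot's implementation: the `ServerStats` class, the `score` formula (in-flight requests multiplied by a moving-average latency), and the `alpha` smoothing factor are all assumptions chosen for illustration, since the abstract only states that routing uses real-time load and performance signals.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    """Hypothetical per-server signals a broker might track (illustrative only)."""
    in_flight: int = 0             # queries currently outstanding on this server
    ewma_latency_ms: float = 0.0   # exponentially weighted moving average latency

    def record_latency(self, latency_ms: float, alpha: float = 0.3) -> None:
        # The first observation seeds the average; later ones are blended in.
        if self.ewma_latency_ms == 0.0:
            self.ewma_latency_ms = latency_ms
        else:
            self.ewma_latency_ms = (
                alpha * latency_ms + (1 - alpha) * self.ewma_latency_ms
            )


def score(stats: ServerStats) -> float:
    # Assumed cost model: lower is better. A server with many outstanding
    # queries or a high recent latency gets a larger score.
    return (stats.in_flight + 1) * stats.ewma_latency_ms


def pick_server(candidates: dict[str, ServerStats]) -> str:
    # Choose the lowest-scoring replica; sorting names makes ties deterministic.
    return min(sorted(candidates), key=lambda name: score(candidates[name]))


if __name__ == "__main__":
    replicas = {
        "server-1": ServerStats(in_flight=2, ewma_latency_ms=10.0),
        "server-2": ServerStats(in_flight=0, ewma_latency_ms=25.0),
        "server-3": ServerStats(in_flight=5, ewma_latency_ms=200.0),  # slow node
    }
    print(pick_server(replicas))
```

In this sketch a slow or overloaded replica is deprioritized rather than excluded, so it can receive traffic again once its signals recover. A server with no latency history scores zero and is tried first, which is one simple way to gather an initial measurement.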

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
