DB · LG · Aug 13, 2019

Adaptive Learning of Aggregate Analytics under Dynamic Workloads

arXiv:1908.04772v2 · 0.00 · 7 citations
AI Analysis · 50

This work addresses the problem of reducing query response time and resource consumption for organizations using big data infrastructures, though its contribution is incremental: it adapts existing ML techniques to dynamic workloads.

The paper tackles the high cost of aggregate analytics on large distributed data by introducing a lightweight client-side machine learning mechanism that estimates query answers in milliseconds, avoiding expensive backend processing, and demonstrates its effectiveness through extensive evaluation.

Large organizations have seamlessly incorporated data-driven decision making in their operations. However, as data volumes increase, expensive big data infrastructures are called to the rescue. In this setting, analytics tasks become very costly in terms of query response time, resource consumption, and money in cloud deployments, especially when base data are stored across geographically distributed data centers. Therefore, we introduce an adaptive Machine Learning mechanism which is light-weight, stored client-side, can estimate the answers of a variety of aggregate queries, and can avoid the big data backend. The estimations are performed in milliseconds and are inexpensive and accurate, as the mechanism learns from past analytical-query patterns. However, as analytic queries are ad-hoc and analysts' interests change over time, we develop solutions that can swiftly and accurately detect such changes and adapt to new query patterns. The capabilities of our approach are demonstrated using extensive evaluation with real and synthetic datasets.
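To make the idea in the abstract concrete, the following is a minimal sketch, not the paper's method. It assumes each aggregate query can be encoded as a small numeric vector (for example, the centre and width of a range predicate), uses a plain nearest-neighbour average as a stand-in for the learned model, and treats a query that lies far from every past query as a sign that the workload has shifted. The names `ClientSideEstimator`, `novelty_threshold`, `answer`, and `backend` are all hypothetical.

```python
import math


class ClientSideEstimator:
    """Toy k-nearest-neighbour estimator over past (query vector, answer) pairs.

    Assumption: a stand-in for the learned model described in the abstract.
    """

    def __init__(self, k=3, novelty_threshold=1.0):
        self.k = k
        self.novelty_threshold = novelty_threshold  # hypothetical tuning knob
        self.history = []  # list of (query_vector, exact_answer)

    def observe(self, query, exact_answer):
        """Store a query together with the exact answer from the backend."""
        self.history.append((tuple(query), float(exact_answer)))

    def _neighbours(self, query):
        # Sort past queries by Euclidean distance and keep the k closest.
        scored = sorted((math.dist(query, q), a) for q, a in self.history)
        return scored[: self.k]

    def is_novel(self, query):
        """Crude change detector: no past query lies within the threshold."""
        if not self.history:
            return True
        nearest_distance = self._neighbours(query)[0][0]
        return nearest_distance > self.novelty_threshold

    def estimate(self, query):
        """Average the answers of the k closest past queries."""
        neighbours = self._neighbours(query)
        if not neighbours:
            raise ValueError("no past queries to learn from")
        return sum(a for _, a in neighbours) / len(neighbours)


def answer(query, estimator, backend):
    """Answer locally when the query looks familiar, else ask the backend."""
    if estimator.is_novel(query):
        exact = backend(query)            # expensive path
        estimator.observe(query, exact)   # adapt to the new pattern
        return exact, "backend"
    return estimator.estimate(query), "estimate"
```

In this sketch, familiar queries never reach the backend, while unfamiliar ones are answered exactly and immediately added to the local history, which is one simple way to picture adapting to a changing workload.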
