DBAIOSJul 30, 2023

Towards Learned Predictability of Storage Systems

arXiv:2307.16288v11 citationsh-index: 5
Originality Synthesis-oriented
AI Analysis

This is an incremental survey paper that addresses reliability challenges in storage systems for cloud computing and big data applications.

The paper surveys machine learning-based black-box approaches for predicting performance and failures in storage systems to improve reliability, low tail latency, and high availability, but does not present new experimental results or concrete numbers.

With the rapid development of cloud computing and big data technologies, storage systems have become a fundamental building block of datacenters, incorporating hardware innovations such as flash solid state drives and non-volatile memories, as well as software infrastructures such as RAID and distributed file systems. Despite the growing popularity and interests in storage, designing and implementing reliable storage systems remains challenging, due to their performance instability and prevailing hardware failures. Proactive prediction greatly strengthens the reliability of storage systems. There are two dimensions of prediction: performance and failure. Ideally, through detecting in advance the slow IO requests, and predicting device failures before they really happen, we can build storage systems with especially low tail latency and high availability. While its importance is well recognized, such proactive prediction in storage systems, on the other hand, is particularly difficult. To move towards predictability of storage systems, various mechanisms and field studies have been proposed in the past few years. In this report, we present a survey of these mechanisms and field studies, focusing on machine learning based black-box approaches. Based on three representative research works, we discuss where and how machine learning should be applied in this field. The strengths and limitations of each research work are also evaluated in detail.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes