SE | Jun 8, 2020

Maximizing Error Injection Realism for Chaos Engineering with System Calls

arXiv:2006.04444v7 | 1 citation
AI Analysis

This paper addresses reliability testing for developers who practice chaos engineering by making error injection more realistic. The contribution is incremental in that it builds on existing fault injection methods.

The paper tackles the problem of assessing application reliability by introducing Phoebe, a fault injection framework for system call invocation errors whose error models mimic errors that occur naturally in production. In the evaluation on two real-world applications, Phoebe detects reliability weaknesses related to system call invocation errors.

In this paper, we present a novel fault injection framework for system call invocation errors, called Phoebe. Phoebe is unique as follows. First, Phoebe enables developers to have full observability of system call invocations. Second, Phoebe generates error models that are realistic in the sense that they mimic errors that naturally happen in production. Third, Phoebe is able to automatically conduct experiments to systematically assess the reliability of applications with respect to system call invocation errors in production. We evaluate the effectiveness and runtime overhead of Phoebe on two real-world applications in a production environment. The results show that Phoebe successfully generates realistic error models and is able to detect important reliability weaknesses with respect to system call invocation errors. To our knowledge, this novel concept of "realistic error injection", which consists of grounding fault injection on production errors, has never been studied before.
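The idea of grounding fault injection on production errors can be illustrated with a small sketch. The following is not Phoebe's implementation; it is a minimal Python example under stated assumptions: system call observations are already available as (syscall name, errno or None) pairs, and a "realistic" error model is approximated by amplifying each observed error rate by a constant factor. The function name `build_error_model` and the `amplification` parameter are hypothetical and chosen only for illustration.

```python
import errno
from collections import Counter


def build_error_model(observations, amplification=10.0, max_rate=1.0):
    """Derive injection rates from observed system call errors.

    observations: iterable of (syscall_name, errno_code or None) pairs,
                  where None means the invocation succeeded.
    Returns a dict mapping (syscall_name, errno_code) to an injection
    probability. Hypothetical model: the observed error rate multiplied
    by `amplification`, capped at `max_rate`.
    """
    calls = Counter()     # total invocations per syscall
    failures = Counter()  # failed invocations per (syscall, errno)
    for syscall, err in observations:
        calls[syscall] += 1
        if err is not None:
            failures[(syscall, err)] += 1

    model = {}
    for (syscall, err), count in failures.items():
        observed_rate = count / calls[syscall]
        model[(syscall, err)] = min(max_rate, observed_rate * amplification)
    return model


# Illustrative (made-up) observations: 100 reads with 2 EAGAIN failures,
# 10 opens with 1 ENOENT failure.
observations = (
    [("read", None)] * 98
    + [("read", errno.EAGAIN)] * 2
    + [("open", None)] * 9
    + [("open", errno.ENOENT)]
)
model = build_error_model(observations)
for (syscall, err), rate in sorted(model.items()):
    print(f"{syscall} {errno.errorcode[err]} inject with p={rate:.2f}")
```

Only errors that were actually observed receive a non-zero injection rate, which is what distinguishes this style of model from injecting arbitrary error codes. The constant amplification factor is one possible design choice among many and is an assumption of this sketch.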
