DB · SE · Mar 21, 2020

Causality-Guided Adaptive Interventional Debugging

arXiv:2003.09539v3 · 31 citations
Originality: Incremental advance
AI Analysis

This work addresses the difficulty developers face when debugging intermittent failures in database applications. It is an incremental improvement that integrates existing techniques in a novel way.

The paper tackles intermittent failures in database applications by proposing Adaptive Interventional Debugging (AID), which combines statistical debugging, causal analysis, fault injection, and group testing to pinpoint root causes and explain how they trigger failures. In evaluations on real-world and synthetic applications, AID identifies root causes faster than group testing and more precisely than statistical debugging.

Runtime nondeterminism is a fact of life in modern database applications. Previous research has shown that nondeterminism can cause applications to intermittently crash, become unresponsive, or experience data corruption. We propose Adaptive Interventional Debugging (AID) for debugging such intermittent failures. AID combines existing statistical debugging, causal analysis, fault injection, and group testing techniques in a novel way to (1) pinpoint the root cause of an application's intermittent failure and (2) generate an explanation of how the root cause triggers the failure. AID works by first identifying a set of runtime behaviors (called predicates) that are strongly correlated to the failure. It then utilizes temporal properties of the predicates to (over)-approximate their causal relationships. Finally, it uses fault injection to execute a sequence of interventions on the predicates and discover their true causal relationships. This enables AID to identify the true root cause and its causal relationship to the failure. We theoretically analyze how fast AID can converge to the identification. We evaluate AID with six real-world applications that intermittently fail under specific inputs. In each case, AID was able to identify the root cause and explain how the root cause triggered the failure, much faster than group testing and more precisely than statistical debugging. We also evaluate AID with many synthetically generated applications with known root causes and confirm that the benefits also hold for them.
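The three stages described in the abstract (correlate, order by time, intervene) can be illustrated with a toy sketch. This is not the paper's algorithm or code: the predicate names, the `FIRST_SEEN` timestamps, the `TRUE_CAUSES` set, and the `still_fails` oracle are all hypothetical stand-ins, and the halving search assumes a simplified world in which blocking any causal predicate prevents the failure.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical toy model: a run is (predicates observed, did the run fail?).
Run = Tuple[FrozenSet[str], bool]


def correlated_predicates(runs: List[Run]) -> Set[str]:
    """Stage 1 (statistical debugging, simplified): keep predicates that
    appear in every failed run and in no successful run."""
    failed = [obs for obs, bad in runs if bad]
    passed = [obs for obs, bad in runs if not bad]
    if not failed:
        return set()
    candidates = set(failed[0]).intersection(*failed[1:])
    for obs in passed:
        candidates -= obs
    return candidates


def find_root_cause(
    ordered: List[str],
    still_fails: Callable[[Set[str]], bool],
) -> Optional[str]:
    """Stage 3 (interventions with group-testing-style halving): find the
    earliest predicate whose blocking prevents the failure.

    `ordered` must be sorted by first occurrence (stage 2, temporal order).
    `still_fails(blocked)` stands in for a fault-injection run that
    suppresses the predicates in `blocked` and reports whether the
    application still fails.
    """
    if not ordered:
        return None
    lo, hi = 0, len(ordered)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if still_fails(set(ordered[lo:mid])):
            lo = mid  # blocking the earlier group did not help
        else:
            hi = mid  # some cause lies in the earlier group
    # One confirming intervention on the single remaining predicate.
    return None if still_fails({ordered[lo]}) else ordered[lo]


# Hypothetical data: "slow_io" and "retry_logged" correlate with the
# failure but do not cause it; "gc_pause" also occurs in a passing run.
RUNS: List[Run] = [
    (frozenset({"slow_io", "race", "retry_logged", "stale_read", "null_deref"}), True),
    (frozenset({"slow_io", "race", "retry_logged", "stale_read", "null_deref", "gc_pause"}), True),
    (frozenset({"gc_pause"}), False),
    (frozenset(), False),
]
FIRST_SEEN: Dict[str, int] = {
    "gc_pause": 0, "slow_io": 1, "race": 2,
    "retry_logged": 3, "stale_read": 4, "null_deref": 5,
}
TRUE_CAUSES = {"race", "stale_read", "null_deref"}  # hidden ground truth


def still_fails(blocked: Set[str]) -> bool:
    """Simulated fault injection: the failure vanishes if any cause is blocked."""
    return not (blocked & TRUE_CAUSES)


candidates = correlated_predicates(RUNS)
ordered = sorted(candidates, key=FIRST_SEEN.__getitem__)  # stage 2
root = find_root_cause(ordered, still_fails)
print(root)  # prints: race
```

In this toy, correlation alone leaves five candidates, including two that are not causes. Testing groups of predicates at once finds the earliest causal one in three simulated interventions rather than one per candidate, which is the intuition behind combining interventions with group testing.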

