Causal fault localisation in dataflow systems
This work addresses fault localisation for developers using dataflow systems, but its contribution is incremental: it provides practical validation of an existing idea.
The paper tackles fault localisation in dataflow systems by applying causal inference techniques to dataflow graphs, and demonstrates detection of software bugs and data shifts across three modern dataflow engines.
Dataflow computing has been shown to bring significant benefits to multiple niches of systems engineering and has the potential to become a general-purpose paradigm of choice for data-driven application development. One of the characteristic features of dataflow computing is natural access to the dataflow graph of the entire system. Recently it has been observed that these dataflow graphs can be treated as complete graphical causal models, opening opportunities to apply causal inference techniques to dataflow systems. In this demonstration paper, we aim to provide the first practical validation of this idea, with a particular focus on causal fault localisation. We provide several demonstrations of how causal inference can be used to detect software bugs and data shifts in a range of scenarios with three modern dataflow engines.
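To make the idea of reading a dataflow graph as a causal graph more concrete, the following is a minimal sketch and not the paper's method. It assumes a hypothetical six-node pipeline (`PARENTS`) and assumes that some per-node monitor has already flagged which nodes look anomalous. The heuristic used here, which is a simplification chosen for illustration, reports as likely fault locations those anomalous nodes upstream of a failing output whose own inputs all look healthy. All node and function names are invented for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy pipeline: each node maps to the set of nodes it reads from.
# In a dataflow system this structure is available from the engine itself.
PARENTS = {
    "source": set(),
    "clean": {"source"},
    "features": {"clean"},
    "lookup": set(),
    "join": {"features", "lookup"},
    "report": {"join"},
}


def ancestors(node, parents):
    """Return every upstream node of `node`, i.e. its potential causes."""
    seen = set()
    stack = list(parents[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def localise_fault(failing_node, anomalous, parents):
    """Return anomalous nodes in the causal past of `failing_node`
    (including the node itself) whose direct parents are all healthy.

    `anomalous` is assumed to come from an external per-node check.
    """
    candidates = (ancestors(failing_node, parents) | {failing_node}) & anomalous
    # static_order() yields predecessors before the nodes that depend on them.
    order = TopologicalSorter(parents).static_order()
    return [n for n in order if n in candidates and not (parents[n] & anomalous)]


# Example: a bug in "clean" propagates downstream to "report".
suspects = localise_fault("report", {"clean", "features", "join", "report"}, PARENTS)
```

In this example only `clean` is returned, because every other flagged node has at least one flagged parent and is therefore explained by an upstream anomaly. A data shift would show up the same way at an input node such as `source` or `lookup`. A fuller causal treatment would estimate effects or test interventions rather than rely on binary anomaly flags.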