CR Sep 24, 2019
ProvMark: A Provenance Expressiveness Benchmarking System
Sheung Chi Chan, James Cheney, Pramod Bhatotia et al.
System-level provenance is of widespread interest for applications such as security enforcement and information protection. However, testing the correctness or completeness of provenance capture tools is challenging and currently done manually. In some cases there is not even a clear consensus about what behavior is correct. We present an automated tool, ProvMark, that uses an existing provenance system as a black box and reliably identifies the provenance graph structure recorded for a given activity, by a reduction to subgraph isomorphism problems handled by an external solver. ProvMark is a first step in the much-needed area of testing and comparing the expressiveness of provenance systems. We demonstrate ProvMark's usefulness in comparing three capture systems with different architectures and distinct design philosophies.
AI Mar 4, 2016
Causal inference for data-driven debugging and decision making in cloud computing
Philipp Geiger, Lucian Carata, Bernhard Schoelkopf
Cloud computing involves complex technical and economic systems and interactions. This brings about various challenges, two of which are: (1) debugging and control to optimize the performance of computing systems, with the help of sandbox experiments, and (2) privacy-preserving prediction of the cost of "spot" resources for decision making of cloud clients. In this paper, we formalize debugging by counterfactual probabilities and control by post-(soft-)interventional probabilities. We prove that counterfactuals can approximately be calculated from a "stochastic" graphical causal model (while they are originally defined only for "deterministic" functional causal models), and, based on this, sketch a data-driven approach to address problem (1). To address problem (2), we formalize bidding by post-(soft-)interventional probabilities and present a simple mathematical result on approximate integration of "incomplete" conditional probability distributions. We show how this can be used by cloud clients to trade off privacy against predictability of the outcome of their bidding actions in a toy scenario. We report experiments on simulated and real data.