Practical Aspect of Privacy-Preserving Data Publishing in Process Mining
This work addresses privacy issues that process analysts face in real-world applications, but its contribution is incremental: it focuses on integrating existing techniques into a practical tool.
The paper tackles the problem of balancing confidentiality and utility in process mining by introducing a Python-based infrastructure that implements state-of-the-art privacy preservation techniques, providing a hierarchy of usages from single techniques to integrated web-based tools.
Process mining techniques such as process discovery and conformance checking provide insights into actual processes by analyzing event data that are widely available in information systems. These data are very valuable but often contain sensitive information, so process analysts need to balance confidentiality and utility. Privacy issues in process mining have recently received more attention from researchers, and this research should be complemented by a tool that integrates the solutions and makes them available in the real world. In this paper, we introduce a Python-based infrastructure implementing state-of-the-art privacy preservation techniques in process mining. The infrastructure provides a hierarchy of usages, from single techniques to a collection of techniques integrated as web-based tools. Our infrastructure manages both standard and non-standard event data resulting from privacy preservation techniques. It also stores explicit privacy metadata to track the modifications applied to protect sensitive data.
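The idea of storing explicit privacy metadata alongside protected event data can be illustrated with a minimal Python sketch. It is not the infrastructure's actual API: the `PrivacyMetadata` class, the `pseudonymize` function, the metadata field names, and the toy event log are all hypothetical. Salted hashing is used only as a simple stand-in for a privacy preservation technique, not as one of the techniques the paper implements.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PrivacyMetadata:
    """Hypothetical record of the modifications applied to an event log."""
    operations: list = field(default_factory=list)

    def record(self, technique, level, target):
        # Append one entry per applied modification so it can be traced later.
        self.operations.append(
            {"technique": technique, "level": level, "target": target}
        )


def pseudonymize(log, attribute, salt, metadata):
    """Replace each value of `attribute` with a truncated salted SHA-256 digest.

    Illustrative stand-in for a privacy preservation technique (assumption).
    """
    protected = []
    for event in log:
        event = dict(event)  # copy so the input log is left unchanged
        if attribute in event:
            raw = (salt + str(event[attribute])).encode("utf-8")
            event[attribute] = hashlib.sha256(raw).hexdigest()[:12]
        protected.append(event)
    # Track what was done, at which level, and to which attribute.
    metadata.record("pseudonymization", "event", attribute)
    return protected


# Toy event log (hypothetical data): one dict per event.
log = [
    {"case": "c1", "activity": "register", "resource": "alice"},
    {"case": "c1", "activity": "treat", "resource": "bob"},
    {"case": "c2", "activity": "register", "resource": "alice"},
]
metadata = PrivacyMetadata()
protected_log = pseudonymize(log, "resource", salt="demo-salt", metadata=metadata)
```

After the call, `protected_log` carries pseudonyms in place of resource names, while `metadata.operations` lists the applied modification, so a downstream analyst can see that the resource attribute no longer holds original values.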