Privacy-Preserving Data Publishing in Process Mining
This work matters to organizations and researchers who need to analyze process data while complying with privacy regulations: it provides a structured way to record and communicate the privacy transformations applied to event data.
This paper addresses the challenge of publishing privacy-aware event data in process mining by formally defining anonymization operations and creating an infrastructure for recording privacy metadata. The authors propose a privacy extension for the XES standard, along with a general data structure for event data that are not stored as standard event logs, so that this metadata can be recorded in practice.
Process mining aims to provide insights into actual processes based on event data. These data are often recorded by information systems and are widely available. However, they often contain sensitive private information that should be analyzed responsibly. Therefore, privacy issues in process mining have recently received more attention. Privacy preservation techniques necessarily modify the original data, yet they are also expected to preserve data utility. Privacy-preserving transformations of the data may lead to incorrect or misleading analysis results. Hence, new infrastructures need to be designed for publishing privacy-aware event data; their aim is to provide metadata about the privacy-related transformations applied to the event data without revealing details of the privacy preservation techniques or the protected information. In this paper, we provide formal definitions for the main anonymization operations used by privacy models in process mining. These definitions are used to create an infrastructure for recording the privacy metadata. We demonstrate the practical use of the proposed privacy metadata by designing a privacy extension for the XES standard and a general data structure for event data that are not in the form of standard event logs.
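To make the idea of privacy metadata concrete, the following is a minimal sketch, not the paper's formal definitions or its XES extension. It assumes an event log represented as a list of traces, each a list of event dictionaries, and uses suppression as an example anonymization operation. The names PrivacyMetadata, suppress_attribute, and the metadata fields (operation, level, target) are hypothetical and chosen for illustration only. The point it shows is that the published metadata states which kind of operation touched which attribute, without listing the values that were protected.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyMetadata:
    """Hypothetical container: one entry per kind of operation applied."""
    operations: list = field(default_factory=list)

    def record(self, operation, level, target):
        # Store only the operation type, its level, and the affected attribute.
        entry = {"operation": operation, "level": level, "target": target}
        if entry not in self.operations:
            self.operations.append(entry)


def suppress_attribute(log, attribute, sensitive_values, metadata):
    """Return a copy of the log with sensitive values of `attribute` set to None."""
    anonymized = []
    changed = False
    for trace in log:
        new_trace = []
        for event in trace:
            new_event = dict(event)  # copy, so the original log is untouched
            if new_event.get(attribute) in sensitive_values:
                new_event[attribute] = None
                changed = True
            new_trace.append(new_event)
        anonymized.append(new_trace)
    if changed:
        # The metadata never mentions which values were suppressed.
        metadata.record("suppression", "event", attribute)
    return anonymized


# Illustrative data only (assumed, not taken from the paper).
log = [
    [{"activity": "register", "resource": "alice"},
     {"activity": "treat", "resource": "bob"}],
    [{"activity": "register", "resource": "carol"}],
]
meta = PrivacyMetadata()
published = suppress_attribute(log, "resource", {"bob"}, meta)
```

An analyst who receives the published log together with the metadata can see that the resource attribute was subject to event-level suppression and interpret resource-based results with that in mind, while the suppressed value itself is not disclosed.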