Majid Rafiei

h-index: 13

12 papers

189 citations

Novelty: 33%

AI Score: 26

Ranked #163,373 of 201,326 authors (top 81%)

#4,780 in CR (top 65%)

12 Papers

LG · Mar 29, 2023

TraVaG: Differentially Private Trace Variant Generation Using GANs

Majid Rafiei, Frederik Wangelik, Mahsa Pourbafrani et al.

Process mining is rapidly growing in the industry. Consequently, privacy concerns regarding sensitive and private information included in event data, used by process mining algorithms, are becoming increasingly relevant. State-of-the-art research mainly focuses on providing privacy guarantees, e.g., differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques for releasing trace variants still do not fulfill all the requirements of industry-scale usage. Moreover, providing privacy guarantees when there exists a high rate of infrequent trace variants is still a challenge. In this paper, we introduce TraVaG as a new approach for releasing differentially private trace variants based on Generative Adversarial Networks (GANs) that provides industry-scale benefits and enhances the level of privacy guarantees when there exists a high ratio of infrequent variants. Moreover, TraVaG overcomes shortcomings of conventional privacy preservation techniques such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data show that our approach outperforms state-of-the-art techniques in terms of privacy guarantees, plain data utility preservation, and result utility preservation.

LG · Oct 4, 2023

Extracting Rules from Event Data for Study Planning

Majid Rafiei, Duygu Bayrak, Mahsa Pourbafrani et al.

In this study, we examine how event data from campus management systems can be used to analyze the study paths of higher education students. The main goal is to offer valuable guidance for their study planning. We employ process and data mining techniques to explore the impact of sequences of taken courses on academic success. Through the use of decision tree models, we generate data-driven recommendations in the form of rules for study planning and compare them to the recommended study plan. The evaluation focuses on RWTH Aachen University computer science bachelor program students and demonstrates that the proposed course sequence features effectively explain academic performance measures. Furthermore, the findings suggest avenues for developing more adaptable study plans.

LG · Apr 8, 2025

Releasing Differentially Private Event Logs Using Generative Models

Frederik Wangelik, Majid Rafiei, Mahsa Pourbafrani et al.

In recent years, the industry has been witnessing an extended usage of process mining and automated event data analysis. Consequently, there is a rising significance in addressing privacy apprehensions related to the inclusion of sensitive and private information within event data utilized by process mining algorithms. State-of-the-art research mainly focuses on providing quantifiable privacy guarantees, e.g., via differential privacy, for trace variants that are used by the main process mining techniques, e.g., process discovery. However, privacy preservation techniques designed for the release of trace variants are still insufficient to meet all the demands of industry-scale utilization. Moreover, ensuring privacy guarantees in situations characterized by a high occurrence of infrequent trace variants remains a challenging endeavor. In this paper, we introduce two novel approaches for releasing differentially private trace variants based on trained generative models. With TraVaG, we leverage Generative Adversarial Networks (GANs) to sample from a privatized implicit variant distribution. Our second method employs Denoising Diffusion Probabilistic Models that reconstruct artificial trace variants from noise via trained Markov chains. Both methods offer industry-scale benefits and elevate the degree of privacy assurances, particularly in scenarios featuring a substantial prevalence of infrequent variants. Also, they overcome the shortcomings of conventional privacy preservation techniques, such as bounding the length of variants and introducing fake variants. Experimental results on real-life event data demonstrate that our approaches surpass state-of-the-art techniques in terms of privacy guarantees and utility preservation.

SE · May 6, 2024

Process Variant Analysis Across Continuous Features: A Novel Framework

Ali Norouzifar, Majid Rafiei, Marcus Dees et al.

Extracted event data from information systems often contain a variety of process executions making the data complex and difficult to comprehend. Unlike current research which only identifies the variability over time, we focus on other dimensions that may play a role in the performance of the process. This research addresses the challenge of effectively segmenting cases within operational processes based on continuous features, such as duration of cases, and evaluated risk score of cases, which are often overlooked in traditional process analysis. We present a novel approach employing a sliding window technique combined with the earth mover's distance to detect changes in control flow behavior over continuous dimensions. This approach enables case segmentation, hierarchical merging of similar segments, and pairwise comparison of them, providing a comprehensive perspective on process behavior. We validate our methodology through a real-life case study in collaboration with UWV, the Dutch employee insurance agency, demonstrating its practical applicability. This research contributes to the field by aiding organizations in improving process efficiency, pinpointing abnormal behaviors, and providing valuable inputs for process comparison, and outcome prediction.
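
The sliding-window idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: for brevity it swaps the earth mover's distance for the simpler total variation distance between variant frequencies, and the function names, the `(feature_value, variant)` case representation, and the `window_size` parameter are all assumptions made for illustration.

```python
from collections import Counter


def variant_distribution(cases):
    """Relative frequency of each trace variant in a list of cases.

    Each case is a (feature_value, variant) pair, where variant is a
    tuple of activity names (a hypothetical representation).
    """
    counts = Counter(variant for _, variant in cases)
    total = sum(counts.values())
    return {variant: count / total for variant, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two variant distributions.

    Used here as a simple stand-in for the earth mover's distance.
    """
    variants = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in variants)


def sliding_window_distances(cases, window_size):
    """Compare adjacent windows of cases ordered by a continuous feature.

    Returns (feature_value, distance) pairs, where feature_value is the
    feature of the first case in the right-hand window. A high distance
    suggests a change in control-flow behavior around that value.
    """
    ordered = sorted(cases, key=lambda case: case[0])
    results = []
    for i in range(window_size, len(ordered) - window_size + 1):
        left = ordered[i - window_size:i]
        right = ordered[i:i + window_size]
        distance = total_variation(
            variant_distribution(left), variant_distribution(right)
        )
        results.append((ordered[i][0], distance))
    return results
```

Peaks in the returned distances mark candidate segment boundaries along the continuous feature. A fuller treatment would replace `total_variation` with an earth mover's distance that also accounts for how similar two variants are.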

SE · Oct 6, 2021

Trustworthy Artificial Intelligence and Process Mining: Challenges and Opportunities

Andrew Pery, Majid Rafiei, Michael Simon et al.

The premise of this paper is that compliance with Trustworthy AI governance best practices and regulatory frameworks is an inherently fragmented process spanning across diverse organizational units, external stakeholders, and systems of record, resulting in process uncertainties and in compliance gaps that may expose organizations to reputational and regulatory risks. Moreover, there are complexities associated with meeting the specific dimensions of Trustworthy AI best practices such as data governance, conformance testing, quality assurance of AI model behaviors, transparency, accountability, and confidentiality requirements. These processes involve multiple steps, hand-offs, re-works, and human-in-the-loop oversight. In this paper, we demonstrate that process mining can provide a useful framework for gaining fact-based visibility to AI compliance process execution, surfacing compliance bottlenecks, and providing for an automated approach to analyze, remediate and monitor uncertainty in AI regulatory compliance processes.

CR · Jul 30, 2021

PC4PM: A Tool for Privacy/Confidentiality Preservation in Process Mining

Majid Rafiei, Alexander Schnitzler, Wil M. P. van der Aalst

Process mining enables business owners to discover and analyze their actual processes using event data that are widely available in information systems. Event data contain detailed information which is incredibly valuable for providing insights. However, such detailed data often include highly confidential and private information. Thus, concerns of privacy and confidentiality in process mining are becoming increasingly relevant and new techniques are being introduced. To make the techniques easily accessible, new tools need to be developed to integrate the introduced techniques and direct users to appropriate solutions based on their needs. In this paper, we present a Python-based infrastructure implementing and integrating state-of-the-art privacy/confidentiality preservation techniques in process mining. Our tool provides an easy-to-use web-based user interface for privacy-preserving data publishing, risk analysis, and data utility analysis. The tool also provides a set of anonymization operations that can be utilized to support privacy/confidentiality preservation. The tool manages both standard XES event logs and non-standard event data. We also store and manage privacy metadata to track the changes made by privacy/confidentiality preservation techniques.

CR · Jun 1, 2021

Privacy and Confidentiality in Process Mining -- Threats and Research Challenges

Gamal Elkoumy, Stephan A. Fahrenkrog-Petersen, Mohammadreza Fani Sani et al.

Privacy and confidentiality are very important prerequisites for applying process mining in order to comply with regulations and keep company secrets. This paper provides a foundation for future research on privacy-preserving and confidential process mining techniques. Main threats are identified and related to a motivating application scenario in a hospital context as well as to the current body of work on privacy and confidentiality in process mining. A newly developed conceptual model structures the discussion that existing techniques leave room for improvement. This results in a number of important research challenges that should be addressed by future process mining research.

CR · May 25, 2021

Privacy-Preserving Continuous Event Data Publishing