DB · AI · LG · Nov 4, 2024

Generating the Traces You Need: A Conditional Generative Model for Process Mining Data

arXiv:2411.02131v1 · 5 citations · h-index: 27 · ICPM
Originality: Incremental advance
AI Analysis

This work lets process mining researchers and practitioners generate data targeted at specific sub-processes. The advance is incremental: it adapts existing CVAE methods to this domain.

The paper introduces a conditional variational autoencoder (CVAE) that generates process mining traces under specified conditions, giving control over the control-flow and temporal features of each trace. The generated traces are evaluated with common generative-model metrics plus additional metrics for the quality of the conditional generation.

In recent years, trace generation has emerged as a significant challenge within the Process Mining community. Deep Learning (DL) models have demonstrated accuracy in reproducing the features of the selected processes. However, current DL generative models are limited in their ability to adapt the learned distributions to generate data samples based on specific conditions or attributes. This limitation is particularly significant because the ability to control the type of generated data can be beneficial in various contexts, enabling a focus on specific behaviours, exploration of infrequent patterns, or simulation of alternative 'what-if' scenarios. In this work, we address this challenge by introducing a conditional model for process data generation based on a conditional variational autoencoder (CVAE). Conditional models offer control over the generation process by tuning input conditional variables, enabling more targeted and controlled data generation. Unlike other domains, CVAE for process mining faces specific challenges due to the multiperspective nature of the data and the need to adhere to control-flow rules while ensuring data variability. Specifically, we focus on generating process executions conditioned on control flow and temporal features of the trace, allowing us to produce traces for specific, identified sub-processes. The generated traces are then evaluated using common metrics for generative model assessment, along with additional metrics to evaluate the quality of the conditional generation.
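
For orientation, the conditioning idea in the abstract can be written as the standard CVAE training objective. This is the generic textbook form, not necessarily the exact loss used in the paper. The symbols are assumptions introduced here: x is a trace, c is the conditioning vector (for example, control-flow and temporal features), z is the latent variable, and θ, φ are the decoder and encoder parameters.

```latex
% Generic CVAE evidence lower bound (assumed notation, not taken from the paper):
%   x = trace, c = condition, z = latent variable
\log p_\theta(x \mid c) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, c)}\!\left[ \log p_\theta(x \mid z, c) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c) \right)
```

At generation time, one samples z from the prior p_θ(z | c), which many implementations fix to a standard normal, and decodes it with the chosen c. Changing c is what steers the model toward a particular sub-process.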

Code Implementations: 1 repo
