AILGMar 26, 2020

Adversarial System Variant Approximation to Quantify Process Model Generalization

arXiv:2003.12168v211 citations
AI Analysis

This addresses a gap in process mining for assessing model generalization, which is incremental as it builds on existing quality metrics and deep learning techniques.

The paper tackles the problem of quantifying how well a process model generalizes to unobserved system behavior in process mining, proposing AVATAR, a deep learning-based method that approximates variant distributions and samples realistic unobserved variants, demonstrating significant performance improvements in controlled experiments on 15 ground truth systems.

In process mining, process models are extracted from event logs using process discovery algorithms and are commonly assessed using multiple quality dimensions. While the metrics that measure the relationship of an extracted process model to its event log are well-studied, quantifying the level by which a process model can describe the unobserved behavior of its underlying system falls short in the literature. In this paper, a novel deep learning-based methodology called Adversarial System Variant Approximation (AVATAR) is proposed to overcome this issue. Sequence Generative Adversarial Networks are trained on the variants contained in an event log with the intention to approximate the underlying variant distribution of the system behavior. Unobserved realistic variants are sampled either directly from the Sequence Generative Adversarial Network or by leveraging the Metropolis-Hastings algorithm. The degree by which a process model relates to its underlying unknown system behavior is then quantified based on the realistic observed and estimated unobserved variants using established process model quality metrics. Significant performance improvements in revealing realistic unobserved variants are demonstrated in a controlled experiment on 15 ground truth systems. Additionally, the proposed methodology is experimentally tested and evaluated to quantify the generalization of 60 discovered process models with respect to their systems.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes