AIHC Aug 19, 2020

Using Sampling Strategy to Assist Consensus Sequence Analysis

arXiv:2008.08300v2
AI Analysis

This work addresses a specific bottleneck for domain experts in process mining, but it is incremental: it builds on existing consensus sequence methods.

The paper asks how many traces are needed to produce a representative consensus sequence in process mining. It proposes a sampling strategy, shows how to estimate the difference between an Expert Model and the real processes, and applies the strategy to real-world datasets.

Consensus sequences of event logs are often used in process mining to quickly grasp the core sequence of events performed in a process, or to represent the backbone of the process for other analyses. However, it is still not clear how many traces are enough to properly represent the underlying process. In this paper, we propose a novel sampling strategy to determine the number of traces necessary to produce a representative consensus sequence. We show how to estimate the difference between the predefined Expert Model and the real processes carried out. This difference level can be used as a reference for domain experts to adjust the Expert Model. In addition, we apply this strategy to several real-world workflow activity datasets as a case study. We show a sample curve fitting task to help readers better understand our proposed methodology.
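The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea under stated assumptions: traces are tuples of activity names, the consensus is approximated by the medoid trace under edit distance (a stand-in, not necessarily the paper's consensus method), and the difference to the Expert Model is a plain Levenshtein distance. The function names, the toy log, and the expert sequence are all hypothetical.

```python
import random
from statistics import mean


def edit_distance(a, b):
    """Levenshtein distance between two activity sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]


def medoid_consensus(traces):
    """Stand-in consensus: the trace with the smallest total distance to the rest."""
    return min(traces, key=lambda t: sum(edit_distance(t, u) for u in traces))


def difference_curve(log, expert_model, sample_sizes, repeats=20, seed=0):
    """Mean distance between the sampled consensus and the expert model, per sample size."""
    rng = random.Random(seed)
    curve = {}
    for n in sample_sizes:
        dists = []
        for _ in range(repeats):
            sample = rng.sample(log, n)  # sampling without replacement
            dists.append(edit_distance(medoid_consensus(sample), expert_model))
        curve[n] = mean(dists)
    return curve


# Hypothetical toy log: most traces skip the "test" step the expert model expects.
log = ([("register", "triage", "treat", "discharge")] * 8
       + [("register", "treat", "discharge")] * 2)
expert = ("register", "triage", "test", "treat", "discharge")
curve = difference_curve(log, expert, sample_sizes=[2, 5, 10])
```

Plotting `curve` against sample size would show where the difference level stops changing, which is the kind of curve a fitting step could then model to pick a sufficient number of traces.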

