LG | May 28, 2025

Rethinking BPS: A Utility-Based Evaluation Framework

Konrad Özdemir, Lukas Kirchdorfer, Keyvan Amiri Elyasi, Han van der Aa, Heiner Stuckenschmidt

arXiv:2505.22316v1 | 4.1 | h-index: 7

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for researchers and practitioners in business process management by providing a more meaningful evaluation framework. It appears incremental, however, because it builds on existing BPS evaluation approaches.

The paper tackles the evaluation of business process simulation (BPS) models. It addresses two limitations of current methods: treating simulation as a forecasting problem and relying on Earth Mover's Distance-based metrics. It proposes a framework that assesses simulation quality by whether a model generates representative process behavior, and reports that this helps identify sources of discrepancies and distinguish model inaccuracy from data complexity.

Business process simulation (BPS) is a key tool for analyzing and optimizing organizational workflows, supporting decision-making by estimating the impact of process changes. The reliability of such estimates depends on the ability of a BPS model to accurately mimic the process under analysis, making rigorous accuracy evaluation essential. However, the state-of-the-art approach to evaluating BPS models has two key limitations. First, it treats simulation as a forecasting problem, testing whether models can predict unseen future events. This fails to assess how well a model captures the as-is process, particularly when process behavior changes between the training and test periods. As a result, it becomes difficult to determine whether poor results stem from an inaccurate model or from the inherent complexity of the data, such as unpredictable drift. Second, the evaluation approach relies strongly on Earth Mover's Distance-based metrics, which can obscure temporal patterns and thus yield misleading conclusions about simulation quality. To address these issues, we propose a novel framework that evaluates simulation quality based on its ability to generate representative process behavior. Instead of comparing simulated logs to future real-world executions, we evaluate whether predictive process monitoring models trained on simulated data perform comparably to those trained on real data for downstream analysis tasks. Empirical results show that our framework not only helps identify sources of discrepancies but also distinguishes between model accuracy and data complexity, offering a more meaningful way to assess BPS quality.
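
The comparison described in the abstract (train a predictive model on simulated data, train the same model on real data, and compare both on a downstream task) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy remaining-time predictor, the helper names `fit_mean_predictor`, `mae`, and `utility_gap`, the choice of mean absolute error, the use of a shared real evaluation set, and the tiny hand-written data.

```python
from collections import defaultdict
from statistics import mean


def fit_mean_predictor(samples):
    """Fit a toy remaining-time model: the mean target per last activity.

    `samples` is a list of (last_activity, remaining_time) pairs.
    This stands in for a real predictive process monitoring model.
    """
    by_activity = defaultdict(list)
    for activity, remaining in samples:
        by_activity[activity].append(remaining)
    table = {a: mean(values) for a, values in by_activity.items()}
    fallback = mean(r for _, r in samples)  # used for unseen activities
    return lambda activity: table.get(activity, fallback)


def mae(model, samples):
    """Mean absolute error of `model` on (activity, remaining_time) pairs."""
    return mean(abs(model(a) - r) for a, r in samples)


def utility_gap(real_train, simulated, real_eval):
    """Train one model per data source and score both on the same real data.

    Returns (error of real-trained model, error of simulation-trained
    model, difference). A small difference suggests the simulated log
    is about as useful as the real one for this task.
    """
    err_real = mae(fit_mean_predictor(real_train), real_eval)
    err_sim = mae(fit_mean_predictor(simulated), real_eval)
    return err_real, err_sim, err_sim - err_real


# Hypothetical toy data: (last activity, remaining time in hours).
real_train = [("A", 10.0), ("A", 12.0), ("B", 4.0), ("B", 6.0)]
simulated = [("A", 11.0), ("A", 15.0), ("B", 5.0), ("B", 9.0)]
real_eval = [("A", 11.0), ("B", 5.0)]

err_real, err_sim, gap = utility_gap(real_train, simulated, real_eval)
print(err_real, err_sim, gap)
```

In this sketch the error of the real-trained model serves as a baseline for how hard the data is, while the gap to the simulation-trained model reflects what the simulation adds on top. That separation mirrors the abstract's distinction between data complexity and model accuracy, though the paper's actual models, tasks, and metrics may differ.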
