CLJul 18, 2024Code
PM-LLM-Benchmark: Evaluating Large Language Models on Process Mining TasksAlessandro Berti, Humam Kourani, Wil M. P. van der Aalst
Large Language Models (LLMs) have the potential to semi-automate some process mining (PM) analyses. While commercial models are already adequate for many analytics tasks, the competitive level of open-source LLMs in PM tasks is unknown. In this paper, we propose PM-LLM-Benchmark, the first comprehensive benchmark for PM focusing on domain knowledge (process-mining-specific and process-specific) and on different implementation strategies. We focus also on the challenges in creating such a benchmark, related to the public availability of the data and on evaluation biases by the LLMs. Overall, we observe that most of the considered LLMs can perform some process mining tasks at a satisfactory level, but tiny models that would run on edge devices are still inadequate. We also conclude that while the proposed benchmark is useful for identifying LLMs that are adequate for process mining tasks, further research is needed to overcome the evaluation biases and perform a more thorough ranking of the competitive LLMs.
AIAug 30, 2024
Bridging Domain Knowledge and Process Discovery Using Large Language ModelsAli Norouzifar, Humam Kourani, Marcus Dees et al.
Discovering good process models is essential for different process analysis tasks such as conformance checking and process improvements. Automated process discovery methods often overlook valuable domain knowledge. This knowledge, including insights from domain experts and detailed process documentation, remains largely untapped during process discovery. This paper leverages Large Language Models (LLMs) to integrate such knowledge directly into process discovery. We use rules derived from LLMs to guide model construction, ensuring alignment with both domain knowledge and actual process executions. By integrating LLMs, we create a bridge between process knowledge expressed in natural language and the discovery of robust process models, advancing process discovery methodologies significantly. To showcase the usability of our framework, we conducted a case study with the UWV employee insurance agency, demonstrating its practical benefits and effectiveness.
20.9AIMay 23
Beyond Control-Flow: Integrating the Resource Perspective into Multi-Collaborative Process Modeling from TextAnton Antonov, Humam Kourani, Alessandro Berti et al.
Process modeling is a sub-domain of Business Process Management (BPM) focused on the translation of process artifacts into formal models. This task traditionally requires extensive human input and domain expertise in both BPM notations and the specific business context. While Large Language Models (LLMs) can now automate much of this manual work, current text-to-model approaches focus predominantly on the control-flow perspective-ordering activities without considering the collaborative aspect of the processes. In this paper, we introduce a resource-aware generation pipeline that produces formal BPMN 2.0 collaboration diagrams from natural-language descriptions. Rather than solely prompting an LLM for raw XML, we describe a compact, executable intermediate language with mandatory resource details defining both the organization (pool) and the role (lane). Cross-organization dependencies are materialized using the standard formal notation for such interactions-message events-while an orthogonal layout routine automatically handles the spatial arrangement of elements within pools and lanes. Experiments on ten business processes with nine LLMs show strong resource discovery while preserving control-flow quality and adding only marginal runtime overhead. This approach moves generative modeling toward a more comprehensive, multi-collaborative representation of business operations.
DBAug 8, 2024
Leveraging Large Language Models for Enhanced Process Model ComprehensionHumam Kourani, Alessandro Berti, Jasmin Hennrich et al.
In Business Process Management (BPM), effectively comprehending process models is crucial yet poses significant challenges, particularly as organizations scale and processes become more complex. This paper introduces a novel framework utilizing the advanced capabilities of Large Language Models (LLMs) to enhance the interpretability of complex process models. We present different methods for abstracting business process models into a format accessible to LLMs, and we implement advanced prompting strategies specifically designed to optimize LLM performance within our framework. Additionally, we present a tool, AIPA, that implements our proposed framework and allows for conversational process querying. We evaluate our framework and tool by i) an automatic evaluation comparing different LLMs, model abstractions, and prompting strategies and ii) a user study designed to assess AIPA's effectiveness comprehensively. Results demonstrate our framework's ability to improve the accessibility and interpretability of process models, pioneering new pathways for integrating AI technologies into the BPM field.
41.3DBApr 20
Hierarchical Decomposition of Separable Workflow-NetsHumam Kourani, Gyunam Park, Wil M. P. van der Aalst
The Partially Ordered Workflow Language (POWL) has recently emerged as a process modeling notation, offering strong quality guarantees and high expressiveness. While early versions of POWL relied on strict block-structured operators for choices and loops, the language has recently evolved into POWL 2.0, introducing choice graphs to enable the modeling of non-block-structured decisions and cycles. To bridge the gap between the theoretical advantages of POWL and the practical need for compatibility with established notations, robust model transformations are required. This paper presents a novel algorithm for transforming safe and sound workflow nets (WF-nets) into equivalent POWL 2.0 models. The algorithm recursively identifies structural patterns within the WF-net and translates them into their POWL representation. Unlike the previous approach that required separate detection strategies for exclusive choices and loops, our new algorithm utilizes choice graphs to capture generalized decision and cyclic patterns. We formally prove the correctness of our approach, showing that the generated POWL model preserves the language of the input WF-net. Furthermore, we prove the completeness of our algorithm on the class of separable WF-nets, which corresponds to nets constructed via the hierarchical nesting of state machines and marked graphs. We evaluate our algorithm on large-scale process models to demonstrate its high scalability. Furthermore, to test its practical expressiveness, we applied it to a benchmark of 1,493 industrial and synthetic process models. Our algorithm successfully transformed all models in this benchmark, suggesting that POWL 2.0's expressive power is generally sufficient to capture the complex logic found in real-world business processes. This work paves the way for broader adoption of POWL in practical process analysis and improvement applications.
19.1DBApr 20
Revealing Inherent Concurrency in Event Data: A Partial Order Approach to Process DiscoveryHumam Kourani, Gyunam Park, Wil M. P. van der Aalst
Process discovery algorithms traditionally linearize events, failing to capture the inherent concurrency of real-world processes. While some techniques can handle partially ordered data, they often struggle with scalability on large event logs. We introduce a novel, scalable algorithm that directly leverages partial orders in process discovery. Our approach derives partially ordered traces from event data and aggregates them into a sound-by-construction, perfectly fitting process model. Our hierarchical algorithm preserves inherent concurrency while systematically abstracting exclusive choices and loop patterns, enhancing model compactness and precision. We have implemented our technique and demonstrated its applicability on complex real-life event logs. Our work contributes a scalable solution for a more faithful representation of process behavior, especially when concurrency is prevalent in event data.
37.8AIMar 16
PMAx: An Agentic Framework for AI-Driven Process MiningAnton Antonov, Humam Kourani, Alessandro Berti et al.
Process mining provides powerful insights into organizational workflows, but extracting these insights typically requires expertise in specialized query languages and data science tools. Large Language Models (LLMs) offer the potential to democratize process mining by enabling business users to interact with process data through natural language. However, using LLMs as direct analytical engines over raw event logs introduces fundamental challenges: LLMs struggle with deterministic reasoning and may hallucinate metrics, while sending large, sensitive logs to external AI services raises serious data-privacy concerns. To address these limitations, we present PMAx, an autonomous agentic framework that functions as a virtual process analyst. Rather than relying on LLMs to generate process models or compute analytical results, PMAx employs a privacy-preserving multi-agent architecture. An Engineer agent analyzes event-log metadata and autonomously generates local scripts to run established process mining algorithms, compute exact metrics, and produce artifacts such as process models, summary tables, and visualizations. An Analyst agent then interprets these insights and artifacts to compile comprehensive reports. By separating computation from interpretation and executing analysis locally, PMAx ensures mathematical accuracy and data privacy while enabling non-technical users to transform high-level business questions into reliable process insights.
DBMar 7, 2024
ProMoAI: Process Modeling with Generative AIHumam Kourani, Alessandro Berti, Daniel Schuster et al.
ProMoAI is a novel tool that leverages Large Language Models (LLMs) to automatically generate process models from textual descriptions, incorporating advanced prompt engineering, error handling, and code generation techniques. Beyond automating the generation of complex process models, ProMoAI also supports process model optimization. Users can interact with the tool by providing feedback on the generated model, which is then used for refining the process model. ProMoAI utilizes the capabilities LLMs to offer a novel, AI-driven approach to process modeling, significantly reducing the barrier to entry for users without deep technical knowledge in process modeling.
AIOct 8, 2025
Integrating Domain Knowledge into Process Discovery Using Large Language ModelsAli Norouzifar, Humam Kourani, Marcus Dees et al.
Process discovery aims to derive process models from event logs, providing insights into operational behavior and forming a foundation for conformance checking and process improvement. However, models derived solely from event data may not accurately reflect the real process, as event logs are often incomplete or affected by noise, and domain knowledge, an important complementary resource, is typically disregarded. As a result, the discovered models may lack reliability for downstream tasks. We propose an interactive framework that incorporates domain knowledge, expressed in natural language, into the process discovery pipeline using Large Language Models (LLMs). Our approach leverages LLMs to extract declarative rules from textual descriptions provided by domain experts. These rules are used to guide the IMr discovery algorithm, which recursively constructs process models by combining insights from both the event log and the extracted rules, helping to avoid problematic process structures that contradict domain knowledge. The framework coordinates interactions among the LLM, domain experts, and a set of backend services. We present a fully implemented tool that supports this workflow and conduct an extensive evaluation of multiple LLMs and prompt engineering strategies. Our empirical study includes a case study based on a real-life event log with the involvement of domain experts, who assessed the usability and effectiveness of the framework.
AISep 18, 2025
Knowledge-Driven Hallucination in Large Language Models: An Empirical Study on Process ModelingHumam Kourani, Anton Antonov, Alessandro Berti et al.
The utility of Large Language Models (LLMs) in analytical tasks is rooted in their vast pre-trained knowledge, which allows them to interpret ambiguous inputs and infer missing information. However, this same capability introduces a critical risk of what we term knowledge-driven hallucination: a phenomenon where the model's output contradicts explicit source evidence because it is overridden by the model's generalized internal knowledge. This paper investigates this phenomenon by evaluating LLMs on the task of automated process modeling, where the goal is to generate a formal business process model from a given source artifact. The domain of Business Process Management (BPM) provides an ideal context for this study, as many core business processes follow standardized patterns, making it likely that LLMs possess strong pre-trained schemas for them. We conduct a controlled experiment designed to create scenarios with deliberate conflict between provided evidence and the LLM's background knowledge. We use inputs describing both standard and deliberately atypical process structures to measure the LLM's fidelity to the provided evidence. Our work provides a methodology for assessing this critical reliability issue and raises awareness of the need for rigorous validation of AI-generated artifacts in any evidence-based domain.
AIMay 11, 2025
Unlocking Non-Block-Structured Decisions: Inductive Mining with Choice GraphsHumam Kourani, Gyunam Park, Wil M. P. van der Aalst
Process discovery aims to automatically derive process models from event logs, enabling organizations to analyze and improve their operational processes. Inductive mining algorithms, while prioritizing soundness and efficiency through hierarchical modeling languages, often impose a strict block-structured representation. This limits their ability to accurately capture the complexities of real-world processes. While recent advancements like the Partially Ordered Workflow Language (POWL) have addressed the block-structure limitation for concurrency, a significant gap remains in effectively modeling non-block-structured decision points. In this paper, we bridge this gap by proposing an extension of POWL to handle non-block-structured decisions through the introduction of choice graphs. Choice graphs offer a structured yet flexible approach to model complex decision logic within the hierarchical framework of POWL. We present an inductive mining discovery algorithm that uses our extension and preserves the quality guarantees of the inductive mining framework. Our experimental evaluation demonstrates that the discovered models, enriched with choice graphs, more precisely represent the complex decision-making behavior found in real-world processes, without compromising the high scalability inherent in inductive mining techniques.