16.4DBMar 20
AVOCADO: The Streaming Process Mining ChallengeChristian Imenkamp, Andrea Maldonado, Hendrik Reiter et al.
Streaming process mining deals with the real-time analysis of streaming data. Event streams require algorithms capable of processing data incrementally. To systematically address the complexities of this domain, we propose AVOCADO, a standardized challenge framework that provides clear structural divisions: separating the concept and instantiation layers of challenges in streaming process mining for algorithm evaluation. The AVOCADO evaluates algorithms on streaming-specific metrics like accuracy, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Processing Latency, and robustness. This initiative seeks to foster innovation and community-driven discussions to advance the field of streaming process mining. We present this framework as a foundation and invite the community to contribute to its evolution by suggesting new challenges, such as integrating metrics for system throughput and memory consumption, and expanding the scope to address real-world stream complexities like out-of-order event arrival.
AIAug 10, 2022
A Monitoring and Discovery Approach for Declarative Processes Based on StreamsAndrea Burattin, Hugo A. López, Lasse Starklit
Process discovery is a family of techniques that helps to comprehend processes from their data footprints. Yet, as processes change over time so should their corresponding models, and failure to do so will lead to models that under- or over-approximate behavior. We present a discovery algorithm that extracts declarative processes as Dynamic Condition Response (DCR) graphs from event streams. Streams are monitored to generate temporal representations of the process, later processed to generate declarative models. We validated the technique via quantitative and qualitative evaluations. For the quantitative evaluation, we adopted an extended Jaccard similarity measure to account for process change in a declarative setting. For the qualitative evaluation, we showcase how changes identified by the technique correspond to real changes in an existing process. The technique and the data used for testing are available online.
LGSep 13, 2024
ProcessTBench: An LLM Plan Generation Dataset for Process MiningAndrei Cosmin Redis, Mohammadreza Fani Sani, Bahram Zarrin et al.
Large Language Models (LLMs) have shown significant promise in plan generation. Yet, existing datasets often lack the complexity needed for advanced tool use scenarios - such as handling paraphrased query statements, supporting multiple languages, and managing actions that can be done in parallel. These scenarios are crucial for evaluating the evolving capabilities of LLMs in real-world applications. Moreover, current datasets don't enable the study of LLMs from a process perspective, particularly in scenarios where understanding typical behaviors and challenges in executing the same process under different conditions or formulations is crucial. To address these gaps, we present the ProcessTBench synthetic dataset, an extension of the TaskBench dataset specifically designed to evaluate LLMs within a process mining framework.
CLOct 14, 2024
Skill Learning Using Process Mining for Large Language Model Plan GenerationAndrei Cosmin Redis, Mohammadreza Fani Sani, Bahram Zarrin et al.
Large language models (LLMs) hold promise for generating plans for complex tasks, but their effectiveness is limited by sequential execution, lack of control flow models, and difficulties in skill retrieval. Addressing these issues is crucial for improving the efficiency and interpretability of plan generation as LLMs become more central to automation and decision-making. We introduce a novel approach to skill learning in LLMs by integrating process mining techniques, leveraging process discovery for skill acquisition, process models for skill storage, and conformance checking for skill retrieval. Our methods enhance text-based plan generation by enabling flexible skill discovery, parallel execution, and improved interpretability. Experimental results suggest the effectiveness of our approach, with our skill retrieval method surpassing state-of-the-art accuracy baselines under specific conditions.
CLAug 13, 2025
A Framework for Processing Textual Descriptions of Business Processes using a Constrained Language -- Technical ReportAndrea Burattin, Antonio Grama, Ana-Maria Sima et al.
This report explores how (potentially constrained) natural language can be used to enable non-experts to develop process models by simply describing scenarios in plain text. To this end, a framework, called BeePath, is proposed. It allows users to write process descriptions in a constrained pattern-based language, which can then be translated into formal models such as Petri nets and DECLARE. The framework also leverages large language models (LLMs) to help convert unstructured descriptions into this constrained language.
AIJan 23, 2022
Online Soft Conformance Checking: Any Perspective Can Indicate DeviationsAndrea Burattin
Within process mining, a relevant activity is conformance checking. Such activity consists of establishing the extent to which actual executions of a process conform the expected behavior of a reference model. Current techniques focus on prescriptive models of the control-flow as references. In certain scenarios, however, a prescriptive model might not be available and, additionally, the control-flow perspective might not be ideal for this purpose. This paper tackles these two problems by suggesting a conformance approach that uses a descriptive model (i.e., a pattern of the observed behavior over a certain amount of time) which is not necessarily referring to the control-flow (e.g., it can be based on the social network of handover of work). Additionally, the entire approach can work both offline and online, thus providing feedback in real time. The approach, which is implemented in ProM, has been tested and results from 3 experiments with real world as well as synthetic data are reported.
AIFeb 24, 2016
Time and Activity Sequence Prediction of Business Process InstancesMirko Polato, Alessandro Sperduti, Andrea Burattin et al.
The ability to know in advance the trend of running process instances, with respect to different features, such as the expected completion time, would allow business managers to timely counteract to undesired situations, in order to prevent losses. Therefore, the ability to accurately predict future features of running business process instances would be a very helpful aid when managing processes, especially under service level agreement constraints. However, making such accurate forecasts is not easy: many factors may influence the predicted features. Many approaches have been proposed to cope with this problem but all of them assume that the underling process is stationary. However, in real cases this assumption is not always true. In this work we present new methods for predicting the remaining time of running cases. In particular we propose a method, assuming process stationarity, which outperforms the state-of-the-art and two other methods which are able to make predictions even with non-stationary processes. We also describe an approach able to predict the full sequence of activities that a running case is going to take. All these methods are extensively evaluated on two real case studies.
SEFeb 9, 2016
Detection and Quantification of Flow Consistency in Business Process ModelsAndrea Burattin, Vered Bernstein, Manuel Neurauter et al.
Business process models abstract complex business processes by representing them as graphical models. Their layout, solely determined by the modeler, affects their understandability. To support the construction of understandable models it would be beneficial to systematically study this effect. However, this requires a basic set of measurable key visual features, depicting the layout properties that are meaningful to the human user. The aim of this research is thus twofold. First, to empirically identify key visual features of business process models which are perceived as meaningful to the user. Second, to show how such features can be quantified into computational metrics, which are applicable to business process models. We focus on one particular feature, consistency of flow direction, and show the challenges that arise when transforming it into a precise metric. We propose three different metrics addressing these challenges, each following a different view of flow consistency. We then report the results of an empirical evaluation, which indicates which metric is more effective in predicting the human perception of this feature. Moreover, two other automatic evaluations describing the performance and the computational capabilities of our metrics are reported as well.
SEJun 28, 2015
PLG2: Multiperspective Processes Randomization and Simulation for Online and Offline SettingsAndrea Burattin
Process mining represents an important field in BPM and data mining research. Recently, it has gained importance also for practitioners: more and more companies are creating business process intelligence solutions. The evaluation of process mining algorithms requires, as any other data mining task, the availability of large amount of real-world data. Despite the increasing availability of such datasets, they are affected by many limitations, in primis the absence of a "gold standard" (i.e., the reference model). This paper extends an approach, already available in the literature, for the generation of random processes. Novelties have been introduced throughout the work and, in particular, they involve the complete support for multiperspective models and logs (i.e., the control-flow perspective is enriched with time and data information) and for online settings (i.e., generation of multiperspective event streams and concept drifts). The proposed new framework is able to almost entirely cover the spectrum of possible scenarios that can be observed in the real-world. The proposed approach is implemented as a publicly available Java application, with a set of APIs for the programmatic execution of experiments.
SEMar 17, 2015
Conformance Checking Based on Multi-Perspective Declarative Process ModelsAndrea Burattin, Fabrizio Maria Maggi, Alessandro Sperduti
Process mining is a family of techniques that aim at analyzing business process execution data recorded in event logs. Conformance checking is a branch of this discipline embracing approaches for verifying whether the behavior of a process, as recorded in a log, is in line with some expected behaviors provided in the form of a process model. The majority of these approaches require the input process model to be procedural (e.g., a Petri net). However, in turbulent environments, characterized by high variability, the process behavior is less stable and predictable. In these environments, procedural process models are less suitable to describe a business process. Declarative specifications, working in an open world assumption, allow the modeler to express several possible execution paths as a compact set of constraints. Any process execution that does not contradict these constraints is allowed. One of the open challenges in the context of conformance checking with declarative models is the capability of supporting multi-perspective specifications. In this paper, we close this gap by providing a framework for conformance checking based on MP-Declare, a multi-perspective version of the declarative process modeling language Declare. The approach has been implemented in the process mining tool ProM and has been experimented in three real life case studies.
SIJun 12, 2014
SocialSpy: Browsing (Supposedly) Hidden Information in Online Social NetworksAndrea Burattin, Giuseppe Cascavilla, Mauro Conti
Online Social Networks are becoming the most important "places" where people share information about their lives. With the increasing concern that users have about privacy, most social networks offer ways to control the privacy of the user. Unfortunately, we believe that current privacy settings are not as effective as users might think. In this paper, we highlight this problem focusing on one of the most popular social networks, Facebook. In particular, we show how easy it is to retrieve information that a user might have set as (and hence thought as) "private". As a case study, we focus on retrieving the list of friends for users that did set this information as "hidden" (to non-friends). We propose four different strategies to achieve this goal, and we evaluate them. The results of our thorough experiments show the feasibility of our strategies as well as their effectiveness: our approach is able to retrieve a significant percentage of the names of the "hidden" friends: i.e., some 25% on average, and more than 70% for some users.