SE · May 15, 2019

Process Mining for Python (PM4Py): Bridging the Gap Between Process- and Data Science

arXiv:1905.06169v1 · 245 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work serves data scientists and researchers who need flexible, programmable tools for process mining. Its contribution is incremental, as it builds on existing process mining concepts.

The paper tackles the lack of algorithmic customization and integration in existing process mining tools by introducing PM4Py, a Python library that bridges process mining with data science libraries like pandas and scikit-learn, enabling large-scale experimental settings.

Process mining, i.e., a sub-field of data science focusing on the analysis of event data generated during the execution of (business) processes, has changed tremendously over the past two decades. The field started in the early 2000s with limited to no tool support; nowadays, several software tools exist, both open-source (e.g., ProM and Apromore) and commercial (e.g., Disco, Celonis, and ProcessGold). The commercial process mining tools provide limited support for implementing custom algorithms. Moreover, both commercial and open-source process mining tools are often only accessible through a graphical user interface, which hampers their usage in large-scale experimental settings. Initiatives such as RapidProM provide process mining support in the scientific workflow-based data science suite RapidMiner. However, these offer limited to no support for algorithmic customization. In light of this, we present a novel process mining library, i.e., Process Mining for Python (PM4Py), that aims to bridge this gap by providing integration with state-of-the-art data science libraries, e.g., pandas, numpy, scipy and scikit-learn. We provide a global overview of the architecture and functionality of PM4Py, accompanied by some representative examples of its usage.
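To make "analysis of event data" and "algorithmic customization" concrete, the following is a minimal sketch in plain Python of one elementary process mining computation: counting how often one activity directly follows another within a case. It does not use PM4Py and does not reproduce its API; the toy log, the record layout (case id, activity, timestamp), and the function name `directly_follows` are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical toy event log: (case id, activity, timestamp) records.
EVENTS = [
    ("c1", "register", 1),
    ("c1", "check", 2),
    ("c1", "pay", 3),
    ("c2", "register", 1),
    ("c2", "pay", 2),
    ("c3", "register", 5),
    ("c3", "check", 6),
    ("c3", "pay", 9),
]


def directly_follows(events):
    """Count pairs (a, b) where activity b directly follows a in the same case."""
    # Group activities per case, ordered by timestamp within each case.
    traces = defaultdict(list)
    for case_id, activity, _ts in sorted(events, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)

    # Count consecutive activity pairs over all traces.
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts


dfg = directly_follows(EVENTS)
for (source, target), frequency in sorted(dfg.items()):
    print(f"{source} -> {target}: {frequency}")
```

Because the computation is an ordinary function over ordinary data structures, it can be altered, parameterized, or run in a loop over many logs, which is the kind of scripted, large-scale use that a GUI-only tool makes difficult.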
