A framework for mining process models from email logs
This work addresses the need of companies and institutions to understand and reengineer undocumented processes using email data; it represents an incremental advance in process mining from email logs.
The paper tackles the problem of extracting undocumented business processes from email logs. It proposes a new method that uses unsupervised machine learning with little human involvement and semi-automatically labels emails with activity names for activity recognition. The method is demonstrated on a small real-world dataset whose emails belong to two process models.
Owing to its wide use in personal and, above all, professional contexts, email is a valuable source of information that can be harvested to understand, reengineer and repurpose the undocumented business processes of companies and institutions. To this end, a few researchers have investigated the problem of extracting process-oriented information from email logs in order to take advantage of the many available process mining techniques and tools. In this paper we go further in this direction by proposing a new method for mining process models from email logs that leverages unsupervised machine learning techniques with little human involvement. Moreover, our method semi-automatically labels emails with activity names, which can then be used for activity recognition in new incoming emails. A use case demonstrates the usefulness of the proposed solution on a modest-sized, yet real-world, dataset containing emails that belong to two different process models.