SE · Apr 13

Automated BPMN Model Generation from Textual Process Descriptions: A Multi-Stage LLM-Driven Approach

arXiv:2604.12105 · 16.52 citations · h-index: 7
Predicted impact: top 42% in SE · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners needing to convert unstructured process descriptions into executable BPMN models, this work provides a scalable automated approach, though it is incremental over existing LLM-based methods.

The paper presents a multi-stage LLM-driven pipeline that automatically generates BPMN models from textual process descriptions, achieving average reconstruction similarity above 0.75 on 387 validated ground-truth models from 750 public diagrams, with about 50 near-perfect reconstructions.

Automatically reconstructing BPMN models from unstructured natural-language descriptions remains challenging due to heterogeneous modeling conventions, multilingual sources, and the lack of reliable ground truth. We present a scalable, multi-stage LLM-driven pipeline that automates both ground-truth construction and model reconstruction. Multilingual BPMN XML files are translated into English, validated using execution-oriented compliance checks in SpiffWorkflow, and iteratively repaired through targeted LLM-guided corrections to produce a consistent ground-truth corpus. From these validated models, process descriptions are generated and used to reconstruct executable BPMN 2.0 XML diagrams without manual curation. We introduce a multi-dimensional similarity framework combining structural metrics, type-distribution alignment, and embedding-based semantic measures. In an empirical study of 750 public BPMN diagrams, the pipeline generated 387 validated ground-truth models and achieved average reconstruction similarity above 0.75, including approximately 50 near-perfect reconstructions differing only in minor naming variations. The results demonstrate that LLMs can generate structurally compliant and semantically meaningful BPMN diagrams at scale.
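To make the idea of a multi-dimensional similarity score concrete, here is a minimal Python sketch that compares two BPMN 2.0 XML strings along three axes: structure, element-type distribution, and label semantics. It is an illustration under stated assumptions, not the paper's method. The abstract does not give the actual metrics or weights, so the count-ratio structural score, the total-variation type score, the equal weighting, and all function names are hypothetical. A bag-of-words cosine stands in for the embedding-based measure so the example needs only the standard library.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of the BPMN 2.0 model schema
BPMN_NS = "{http://www.omg.org/spec/BPMN/20100524/MODEL}"


def parse_bpmn(xml_text):
    """Return (node type counts, lowercased node names, sequence flow count)."""
    root = ET.fromstring(xml_text)
    types, names, flows = Counter(), [], 0
    for el in root.iter():
        if not el.tag.startswith(BPMN_NS):
            continue
        tag = el.tag[len(BPMN_NS):]
        if tag == "sequenceFlow":
            flows += 1
        elif tag == "task" or tag.endswith(("Task", "Event", "Gateway")):
            types[tag] += 1
            if el.get("name"):
                names.append(el.get("name").lower())
    return types, names, flows


def ratio(a, b):
    """Similarity of two counts in [0, 1]; equal counts (including 0, 0) give 1."""
    return 1.0 if a == b else min(a, b) / max(a, b)


def structural_similarity(types_a, flows_a, types_b, flows_b):
    """Assumed structural score: mean of node-count and flow-count ratios."""
    nodes_a, nodes_b = sum(types_a.values()), sum(types_b.values())
    return 0.5 * (ratio(nodes_a, nodes_b) + ratio(flows_a, flows_b))


def type_distribution_similarity(types_a, types_b):
    """Assumed type score: 1 minus total variation distance between type histograms."""
    total_a, total_b = sum(types_a.values()), sum(types_b.values())
    if total_a == 0 or total_b == 0:
        return 1.0 if total_a == total_b else 0.0
    keys = set(types_a) | set(types_b)
    return 1.0 - 0.5 * sum(
        abs(types_a[k] / total_a - types_b[k] / total_b) for k in keys
    )


def semantic_similarity(names_a, names_b):
    """Bag-of-words cosine over label tokens (a stand-in for real embeddings)."""
    va = Counter(tok for name in names_a for tok in name.split())
    vb = Counter(tok for name in names_b for tok in name.split())
    if not va or not vb:
        return 1.0 if va == vb else 0.0
    dot = sum(va[t] * vb[t] for t in va)
    norm_sq = sum(c * c for c in va.values()) * sum(c * c for c in vb.values())
    return dot / math.sqrt(norm_sq)


def bpmn_similarity(xml_a, xml_b):
    """Return per-dimension scores plus an equally weighted overall score."""
    types_a, names_a, flows_a = parse_bpmn(xml_a)
    types_b, names_b, flows_b = parse_bpmn(xml_b)
    scores = {
        "structural": structural_similarity(types_a, flows_a, types_b, flows_b),
        "type_distribution": type_distribution_similarity(types_a, types_b),
        "semantic": semantic_similarity(names_a, names_b),
    }
    scores["overall"] = sum(scores.values()) / 3  # equal weights are an assumption
    return scores
```

Under this sketch, two diagrams that differ only in how one task is named score 1.0 on the structural and type-distribution dimensions and slightly below 1.0 on the semantic one, which mirrors the "near-perfect reconstructions differing only in minor naming variations" described in the abstract.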

