SE, LG | Mar 29

Large Language Models for Analyzing Enterprise Architecture Debt in Unstructured Documentation

Christin Pagels, Simon Hacks, Rob Henk Bemthuis

arXiv:2604.00046

AI Analysis

For enterprise architects, this work provides an automated method to identify EA debt from previously under-analyzed unstructured documents, though the approach is incremental.

The study proposes using LLMs to detect Enterprise Architecture Smells in unstructured documentation, finding that a custom GPT-based model achieves higher precision and speed, while a fine-tuned on-premise model offers data protection benefits.

Enterprise Architecture Debt (EA Debt) arises from suboptimal design decisions and misaligned components that can degrade an organization's IT landscape over time. Its early indicators, Enterprise Architecture Smells (EA Smells), are currently detected mostly by hand or only from structured artifacts, which leaves much unstructured documentation under-analyzed. This study proposes an approach that uses a large language model (LLM) to identify and quantify EA Debt in unstructured architectural documentation. Following a design science research approach, we design and evaluate an LLM-based prototype for automated EA Smell detection. The artifact ingests unstructured documents (e.g., process descriptions, strategy papers), applies fine-tuned detection models, and outputs the identified smells. We evaluate the prototype through a case study using synthetic yet realistic business documents, benchmarking it against a custom GPT-based model. Results show that LLMs can detect multiple predefined EA Smells in unstructured text, with the benchmark model achieving higher precision and processing speed, and the fine-tuned on-premise model offering data protection advantages. The findings highlight opportunities for integrating LLM-based smell detection into EA governance practice.
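The ingest, detect, and evaluate flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Finding` type, the `detect_smells` and `precision` helpers, the sample documents, the two smell labels, and above all `keyword_detector`, a trivial phrase matcher that stands in where a fine-tuned or GPT-based model would be called. Precision is computed in the standard way, as the share of reported smells that also appear in a hand-labeled reference set.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Finding:
    """One detected smell in one document (hypothetical structure)."""
    doc_id: str
    smell: str


# A detector maps raw document text to the smell labels it believes are present.
Detector = Callable[[str], Iterable[str]]


def keyword_detector(text: str) -> list[str]:
    """Stand-in for an LLM call: flags a smell when a trigger phrase occurs.

    The labels and phrases are illustrative placeholders, not the paper's rules.
    """
    rules = {
        "Cyclic Dependency": "depend on each other",
        "Dead Component": "no longer used",
    }
    lowered = text.lower()
    return [smell for smell, phrase in rules.items() if phrase in lowered]


def detect_smells(docs: dict[str, str], detector: Detector) -> set[Finding]:
    """Run the detector over every document and collect the findings."""
    return {
        Finding(doc_id, smell)
        for doc_id, text in docs.items()
        for smell in detector(text)
    }


def precision(predicted: set[Finding], gold: set[Finding]) -> float:
    """True positives divided by all reported findings (0.0 if none reported)."""
    if not predicted:
        return 0.0
    return len(predicted & gold) / len(predicted)


# Synthetic sample documents and a hand-labeled reference set (both made up).
docs = {
    "process.txt": "Billing and CRM depend on each other for every order.",
    "strategy.txt": "The legacy fax gateway is no longer used but still hosted.",
    "roadmap.txt": "Teams that depend on each other should meet weekly.",
}
gold = {
    Finding("process.txt", "Cyclic Dependency"),
    Finding("strategy.txt", "Dead Component"),
}

predicted = detect_smells(docs, keyword_detector)
score = precision(predicted, gold)
print(f"{len(predicted)} findings, precision = {score:.2f}")
```

Because the detector is passed in as a plain callable, an on-premise model and a hosted model could be compared on the same documents by swapping that one argument. The false positive on `roadmap.txt` shows why precision is a useful measure for this task: the phrase matches, but the sentence is about teams, not components.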
