Sophie Corallo

2 Papers

21.6 · SE · Mar 19
Who's Who? LLM-assisted Software Traceability with Architecture Entity Recognition

Dominik Fuchß, Haoyu Liu, Sophie Corallo et al.

Identifying architecturally relevant entities in textual artifacts is crucial for Traceability Link Recovery (TLR) between Software Architecture Documentation (SAD) and source code. While Software Architecture Models (SAMs) can bridge the semantic gap between these artifacts, their manual creation is time-consuming. LLMs offer new capabilities for extracting architectural entities from SAD and source code to construct SAMs automatically or establish direct trace links. This paper extends our ICSA 2025 paper [19], which introduced Extracting Architecture (ExArch) for LLM-based architecture component name extraction. The extension contributes the novel Architecture Traceability with Entity Matching via Semantic inference (ArTEMiS) approach, an extended evaluation with additional LLMs, configurations, a revised benchmark, and a combined evaluation of both approaches. Specifically, this paper presents the following approaches: ExArch extracts component names as simple SAMs from SAD and source code to eliminate the need for manual SAM creation, while ArTEMiS identifies architectural entities in documentation and matches them with (manually or automatically generated) SAM entities. Our evaluation compares against state-of-the-art approaches SWATTR, TransArC and ArDoCode. TransArC achieves strong performance (F1: 0.87) but requires manually created SAMs; ExArch achieves comparable results (F1: 0.86) using only SAD and code. ArTEMiS is on par with the traditional heuristic-based SWATTR (F1: 0.81) and can successfully replace it when integrated with TransArC. The combination of ArTEMiS and ExArch outperforms ArDoCode, the best baseline without manual SAMs. Our results demonstrate that LLMs can effectively identify architectural entities in textual artifacts, enabling automated SAM generation and TLR, making architecture-code traceability more practical and accessible.

17.7 · CR · May 8
Can I Check What I Designed? Mapping Security Design DSLs to Code Analyzers

Sven Peldszus, Frederik Reiche, Kevin Hermann et al.

When assessing the potential impact of code-level vulnerabilities, e.g., discovered by automated analyzers, it is essential to consider them in the context of the system's security design. However, this is a challenging task due to the abstraction gap between security design, often specified using security DSLs, and implementation. As we will show, even security experts lack a complete understanding of this relationship. Intrigued by this gap (and the general disconnect between secure design and secure implementation) we present a study of 66 design-level security DSLs and 559 security checks from 36 code-level analyzers. We identify what concepts are common to both and capture them in the SecLan model, which has been validated by 22 security experts. Based on this, we investigate the relationship between DSLs and analyzers quantitatively and explore it qualitatively together with 9 security experts. We learn that there are few commonalities between design-level and implementation-level security; security checks are often described by overly general weaknesses, resulting in many non-obvious potential relationships between security DSLs and analyzers; and even security experts are overwhelmed by this complexity. We provide an empirical basis that helps practitioners and researchers better understand the gap and serves as a first step toward bridging it.