8.3 · SE · Apr 24
Inferring Equivalence Classes from Legacy Undocumented Embedded Binaries for ISO 26262-Compliant Testing
Marco De Luca, Domenico Francesco De Angelis, Domenico Amalfitano et al.
Equivalence class partitioning is a well-established test design technique mandated by safety standards such as ISO 26262 for systematic testing of safety software. In industrial practice, however, its application to legacy undocumented embedded firmware is often hindered by incomplete or outdated functional specifications. This paper proposes a binary-level methodology for inferring output-oriented equivalence classes directly from compiled firmware, without relying on source-level annotations or external documentation. The approach combines control-flow reconstruction and guided symbolic execution to analyze individual functions and group execution paths according to indistinguishable observable behavior, including return values and output parameters. An optional post-processing step produces human-readable representations to support comprehension and documentation. The methodology is evaluated in an industrial automotive context through a practitioner-based study assessing correctness and interpretability. Results indicate strong alignment with expert expectations and a positive perception of readability and usefulness for supporting function understanding and test design. These findings demonstrate the feasibility and practical relevance of binary-level equivalence class inference for systematic testing of legacy undocumented safety-embedded software.
33.1 · SE · Apr 9
CIAO - Code In Architecture Out - Automated Software Architecture Documentation with Large Language Models
Marco De Luca, Tiziano Santilli, Domenico Amalfitano et al.
Software architecture documentation is essential for system comprehension, yet it is often unavailable or incomplete. While recent LLM-based techniques can generate documentation from code, they typically address local artifacts rather than producing coherent, system-level architectural descriptions. This paper presents a structured process for automatically generating system-level architectural documentation directly from GitHub repositories using Large Language Models. The process, called CIAO (Code In Architecture Out), defines an LLM-based workflow that takes a repository as input and produces system-level architectural documentation following a template derived from ISO/IEC/IEEE 42010, SEI Views & Beyond, and the C4 model. The resulting documentation can be directly added to the target repository. We evaluated the process through a study with 22 developers, each reviewing the documentation generated for a repository they had contributed to. The evaluation shows that developers generally perceive the produced documentation as valuable, comprehensible, and broadly accurate with respect to the source code, while also highlighting limitations in diagram quality, high-level context modeling, and deployment views. We also assessed the operational cost of the process, finding that generating a complete architectural document requires only a few minutes and is inexpensive to run. Overall, the results indicate that a structured, standards-oriented approach can effectively guide LLMs in producing system-level architectural documentation that is both usable and cost-effective.
SE · Mar 11, 2015
Toward Reverse Engineering of VBA Based Excel Spreadsheet Applications
Domenico Amalfitano, Nicola Amatucci, Vincenzo De Simone et al.
Modern spreadsheet systems can be used to implement complex spreadsheet applications including data sheets, customized user forms and executable procedures written in a scripting language. These applications are often developed by practitioners who do not follow any software engineering practices and do not produce any design documentation. As a result, spreadsheet applications may be very difficult to maintain or restructure. In this position paper we briefly present two reverse engineering techniques and a tool that we are currently developing for the abstraction of conceptual data models and business logic models.