52.6SEJun 2
Automated Repair of Requirements for Cyber-Physical Systems in Simulink Requirements TablesAren A. Babikian, Alessio Di Sandro, Federico Formica et al.
The development of complex software systems, e.g., cyber-physical systems (CPSs), involves continuous evolution of both system implementations and their requirements. These two artifacts often proceed independently, creating a risk of misalignment. For example, a system may be updated due to implementation-level concerns, yielding a new version that no longer satisfies its original requirements. Traditional compliance recovery techniques, e.g., automated program repair, address this problem by modifying the system while assuming that requirements are correct. However, faulty, outdated or inadequate requirements are a well-documented challenge in practice, motivating the complementary task of requirement repair. In this paper, we propose a framework that leverages system execution data to repair misaligned CPS requirements, thereby restoring requirement-to-system compliance. Our approach evaluates the correctness of declarative requirements over time-based, real-valued signals expressed using the MATLAB Simulink Requirements Tables language. We evaluate seven variants of our framework on six real-world case studies covering 12 requirements. Results confirm the effectiveness of the proposed framework in producing correct and useful repaired requirements.
42.1SEApr 11
Engineering Resource-constrained Software Systems with DNN Components: a Concept-based Pruning ApproachFederico Formica, Andrea Rota, Aurora Francesca Zanenga et al.
Deep Neural Networks (DNNs) are widely used by engineers to solve difficult problems that require predictive modeling from data. However, these models are often massive, with millions or billions of parameters, and require substantial computational power, RAM, and storage. This becomes a limitation in practical scenarios where strict size and resource constraints must be respected. In this paper, we present a novel concept-based pruning technique for DNNs that guides pruning decisions using human-interpretable concepts, such as features, colors, and classes. This is particularly important in a software engineering context, as DNNs are integrated into systems and must be pruned according to specific system requirements. Our concept-based pruning solution analyzes neuron activations to identify important neurons from a system requirements viewpoint and uses this information to guide the DNN pruning. We assess our solution using the VGG-19 network and a dataset of 26'384 RGB images, focusing on its ability to produce small, effective pruned DNNs and on the computational complexity and performance of these pruned DNNs. We also analyzed the pruning efficiency of our solution and compared alternative configurations. Our results show that concept-based pruning efficiently generates much smaller, effective pruned DNNs. Pruning greatly improves the computational efficiency and performance of DNNs, properties that are particularly useful for practical applications with stringent memory and computational time constraints. Finally, alternative configuration options enable engineers to identify trade-offs adapted to different practical situations.
18.0LGMar 20
Ensembles-based Feature Guided AnalysisFederico Formica, Stefano Gregis, Andrea Rota et al.
Recent Deep Neural Networks (DNN) applications ask for techniques that can explain their behavior. Existing solutions, such as Feature Guided Analysis (FGA), extract rules on their internal behaviors, e.g., by providing explanations related to neurons activation. Results from the literature show that these rules have considerable precision (i.e., they correctly predict certain classes of features), but the recall (i.e., the number of situations these rule apply) is more limited. To mitigate this problem, this paper presents Ensembles-based Feature Guided Analysis (EFGA). EFGA combines rules extracted by FGA into ensembles. Ensembles aggregate different rules to increase their applicability depending on an aggregation criterion, a policy that dictates how to combine rules into ensembles. Although our solution is extensible, and different aggregation criteria can be developed by users, in this work, we considered three different aggregation criteria. We evaluated how the choice of the criterion influences the effectiveness of EFGA on two benchmarks (i.e., the MNIST and LSC datasets), and found that different aggregation criteria offer alternative trade-offs between precision and recall. We then compare EFGA with FGA. For this experiment, we selected an aggregation criterion that provides a reasonable trade-off between precision and recall. Our results show that EFGA has higher train recall (+28.51% on MNIST, +33.15% on LSC), and test recall (+25.76% on MNIST, +30.81% on LSC) than FGA, with a negligible reduction on the test precision (-0.89% on MNIST, -0.69% on LSC).
SEJan 10, 2025
Test Case Generation for Simulink Models: An Experience from the E-Bike DomainMichael Marzella, Andrea Bombarda, Marcello Minervini et al.
Cyber-physical systems development often requires engineers to search for defects in their Simulink models. Search-based software testing (SBST) is a standard technology that supports this activity. To increase practical adaption, industries need empirical evidence of the effectiveness and efficiency of (existing) SBST techniques on benchmarks from different domains and of varying complexity. To address this industrial need, this paper presents our experience assessing the effectiveness and efficiency of SBST in generating failure-revealing test cases for cyber-physical systems requirements. Our study subject is within the electric bike (e-Bike) domain and concerns the software controller of an e-Bike motor, particularly its functional, regulatory, and safety requirements. We assessed the effectiveness and efficiency of HECATE, an SBST framework for Simulink models, to analyze two software controllers. HECATE successfully identified failure-revealing test cases for 83% (30 out of 36) of our experiments. It required, on average, 1 h 17 min 26 s (min = 11 min 56 s, max = 8 h 16 min 22 s, std = 1 h 50 min 34 s) to compute the failure-revealing test cases. The developer of the e-Bike model confirmed the failures identified by HECATE. We present the lessons learned and discuss the relevance of our results for industrial applications, the state of practice improvement, and the results' generalizability.
SESep 19, 2025
Failure Modes and Effects Analysis: An Experience from the E-Bike DomainAndrea Bombarda, Federico Conti, Marcello Minervini et al.
Software failures can have catastrophic and costly consequences. Functional Failure Mode and Effects Analysis (FMEA) is a standard technique used within Cyber-Physical Systems (CPS) to identify software failures and assess their consequences. Simulation-driven approaches have recently been shown to be effective in supporting FMEA. However, industries need evidence of the effectiveness of these approaches to increase practical adoption. This industrial paper presents our experience with using FMEA to analyze the safety of a CPS from the e-Bike domain. We used Simulink Fault Analyzer, an industrial tool that supports engineers with FMEA. We identified 13 realistic faults, modeled them, and analyzed their effects. We sought expert feedback to analyze the appropriateness of our models and the effectiveness of the faults in detecting safety breaches. Our results reveal that for the faults we identified, our models were accurate or contained minor imprecision that we subsequently corrected. They also confirm that FMEA helps engineers improve their models. Specifically, the output provided by the simulation-driven support for 38.4% (5 out of 13) of the faults did not match the engineers' expectations, helping them discover unexpected effects of the faults. We present a thorough discussion of our results and ten lessons learned. Our findings are useful for software engineers who work as Simulink engineers, use the Simulink Fault Analyzer, or work as safety analysts.
SEOct 7, 2019Code
Approximation-Refinement Testing of Compute-Intensive Cyber-Physical Models: An Approach Based on System IdentificationClaudio Menghi, Shiva Nejati, Lionel C. Briand et al.
Black-box testing has been extensively applied to test models of Cyber-Physical systems (CPS) since these models are not often amenable to static and symbolic testing and verification. Black-box testing, however, requires to execute the model under test for a large number of candidate test inputs. This poses a challenge for a large and practically-important category of CPS models, known as compute-intensive CPS (CI-CPS) models, where a single simulation may take hours to complete. We propose a novel approach, namely ARIsTEO, to enable effective and efficient testing of CI-CPS models. Our approach embeds black-box testing into an iterative approximation-refinement loop. At the start, some sampled inputs and outputs of the CI-CPS model under test are used to generate a surrogate model that is faster to execute and can be subjected to black-box testing. Any failure-revealing test identified for the surrogate model is checked on the original model. If spurious, the test results are used to refine the surrogate model to be tested again. Otherwise, the test reveals a valid failure. We evaluated ARIsTEO by comparing it with S-Taliro, an open-source and industry-strength tool for testing CPS models. Our results, obtained based on five publicly-available CPS models, show that, on average, ARIsTEO is able to find 24% more requirements violations than S-Taliro and is 31% faster than S-Taliro in finding those violations. We further assessed the effectiveness and efficiency of ARIsTEO on a large industrial case study from the satellite domain. In contrast to S-Taliro, ARIsTEO successfully tested two different versions of this model and could identify three requirements violations, requiring four hours, on average, for each violation.
LGOct 28, 2025
Feature-Guided Analysis of Neural Networks: A Replication StudyFederico Formica, Stefano Gregis, Aurora Francesca Zanenga et al.
Understanding why neural networks make certain decisions is pivotal for their use in safety-critical applications. Feature-Guided Analysis (FGA) extracts slices of neural networks relevant to their tasks. Existing feature-guided approaches typically monitor the activation of the neural network neurons to extract the relevant rules. Preliminary results are encouraging and demonstrate the feasibility of this solution by assessing the precision and recall of Feature-Guided Analysis on two pilot case studies. However, the applicability in industrial contexts needs additional empirical evidence. To mitigate this need, this paper assesses the applicability of FGA on a benchmark made by the MNIST and LSC datasets. We assessed the effectiveness of FGA in computing rules that explain the behavior of the neural network. Our results show that FGA has a higher precision on our benchmark than the results from the literature. We also evaluated how the selection of the neural network architecture, training, and feature selection affect the effectiveness of FGA. Our results show that the selection significantly affects the recall of FGA, while it has a negligible impact on its precision.
SEJan 6, 2021
Combining Genetic Programming and Model Checking to Generate Environment AssumptionsKhouloud Gaaloul, Claudio Menghi, Shiva Nejati et al.
Software verification may yield spurious failures when environment assumptions are not accounted for. Environment assumptions are the expectations that a system or a component makes about its operational environment and are often specified in terms of conditions over the inputs of that system or component. In this article, we propose an approach to automatically infer environment assumptions for Cyber-Physical Systems (CPS). Our approach improves the state-of-the-art in three different ways: First, we learn assumptions for complex CPS models involving signal and numeric variables; second, the learned assumptions include arithmetic expressions defined over multiple variables; third, we identify the trade-off between soundness and informativeness of environment assumptions and demonstrate the flexibility of our approach in prioritizing either of these criteria. We evaluate our approach using a public domain benchmark of CPS models from Lockheed Martin and a component of a satellite control system from LuxSpace, a satellite system provider. The results show that our approach outperforms state-of-the-art techniques on learning assumptions for CPS models, and further, when applied to our industrial CPS model, our approach is able to learn assumptions that are sufficiently close to the assumptions manually developed by engineers to be of practical value.
SESep 25, 2020
Trace-Checking CPS Properties: Bridging the Cyber-Physical GapClaudio Menghi, Enrico Viganò, Domenico Bianculli et al.
Cyber-physical systems combine software and physical components. Specification-driven trace-checking tools for CPS usually provide users with a specification language to express the requirements of interest, and an automatic procedure to check whether these requirements hold on the execution traces of a CPS. Although there exist several specification languages for CPS, they are often not sufficiently expressive to allow the specification of complex CPS properties related to the software and the physical components and their interactions. In this paper, we propose (i) the Hybrid Logic of Signals (HLS), a logic-based language that allows the specification of complex CPS requirements, and (ii) ThEodorE, an efficient SMT-based trace-checking procedure. This procedure reduces the problem of checking a CPS requirement over an execution trace, to checking the satisfiability of an SMT formula. We evaluated our contributions by using a representative industrial case study in the satellite domain. We assessed the expressiveness of HLS by considering 212 requirements of our case study. HLS could express all the 212 requirements. We also assessed the applicability of ThEodorE by running the trace-checking procedure for 747 trace-requirement combinations. ThEodorE was able to produce a verdict in 74.5% of the cases. Finally, we compared HLS and ThEodorE with other specification languages and trace-checking tools from the literature. Our results show that, from a practical standpoint, our approach offers a better trade-off between expressiveness and performance.
MAJul 22, 2020
Proceedings of the First Workshop on Agents and Robots for reliable Engineered AutonomyRafael C. Cardoso, Angelo Ferrando, Daniela Briola et al.
This volume contains the proceedings of the First Workshop on Agents and Robots for reliable Engineered Autonomy (AREA 2020), co-located with the 24th European Conference on Artificial Intelligence (ECAI 2020). AREA brings together researchers from autonomous agents, software engineering and robotic communities, as combining knowledge coming from these research areas may lead to innovative approaches that solve complex problems related with the verification and validation of autonomous robotic systems.
SEMay 9, 2019
Evaluating Model Testing and Model Checking for Finding Requirements Violations in Simulink ModelsShiva Nejati, Khouloud Gaaloul, Claudio Menghi et al.
Matlab/Simulink is a development and simulation language that is widely used by the Cyber-Physical System (CPS) industry to model dynamical systems. There are two mainstream approaches to verify CPS Simulink models: model testing that attempts to identify failures in models by executing them for a number of sampled test inputs, and model checking that attempts to exhaustively check the correctness of models against some given formal properties. In this paper, we present an industrial Simulink model benchmark, provide a categorization of different model types in the benchmark, describe the recurring logical patterns in the model requirements, and discuss the results of applying model checking and model testing approaches to identify requirements violations in the benchmarked models. Based on the results, we discuss the strengths and weaknesses of model testing and model checking. Our results further suggest that model checking and model testing are complementary and by combining them, we can significantly enhance the capabilities of each of these approaches individually. We conclude by providing guidelines as to how the two approaches can be best applied together.
SEMar 8, 2019
Generating Automated and Online Test Oracles for Simulink Models with Continuous and Uncertain BehaviorsClaudio Menghi, Shiva Nejati, Khouloud Gaaloul et al.
Test automation requires automated oracles to assess test outputs. For cyber physical systems (CPS), oracles, in addition to be automated, should ensure some key objectives: (i) they should check test outputs in an online manner to stop expensive test executions as soon as a failure is detected; (ii) they should handle time- and magnitude-continuous CPS behaviors; (iii) they should provide a quantitative degree of satisfaction or failure measure instead of binary pass/fail outputs; and (iv) they should be able to handle uncertainties due to CPS interactions with the environment. We propose an automated approach to translate CPS requirements specified in a logic-based language into test oracles specified in Simulink -- a widely-used development and simulation language for CPS. Our approach achieves the objectives noted above through the identification of a fragment of Signal First Order logic (SFOL) to specify requirements, the definition of a quantitative semantics for this fragment and a sound translation of the fragment into Simulink. The results from applying our approach on 11 industrial case studies show that: (i) our requirements language can express all the 98 requirements of our case studies; (ii) the time and effort required by our approach are acceptable, showing potentials for the adoption of our work in practice, and (iii) for large models, our approach can dramatically reduce the test execution time compared to when test outputs are checked in an offline manner.
SEJan 7, 2019
Specification Patterns for Robotic MissionsClaudio Menghi, Christos Tsigkanos, Patrizio Pelliccione et al.
Mobile and general-purpose robots increasingly support our everyday life, requiring dependable robotics control software. Creating such software mainly amounts to implementing their complex behaviors known as missions. Recognizing the need, a large number of domain-specific specification languages has been proposed. These, in addition to traditional logical languages, allow the use of formally specified missions for synthesis, verification, simulation, or guiding the implementation. For instance, the logical language LTL is commonly used by experts to specify missions, as an input for planners, which synthesize the behavior a robot should have. Unfortunately, domain-specific languages are usually tied to specific robot models, while logical languages such as LTL are difficult to use by non-experts. We present a catalog of 22 mission specification patterns for mobile robots, together with tooling for instantiating, composing, and compiling the patterns to create mission specifications. The patterns provide solutions for recurrent specification problems, each of which detailing the usage intent, known uses, relationships to other patterns, and---most importantly---a template mission specification in temporal logic. Our tooling produces specifications expressed in the LTL and CTL temporal logics to be used by planners, simulators, or model checkers. The patterns originate from 245 realistic textual mission requirements extracted from the robotics literature, and they are evaluated upon a total of 441 real-world mission requirements and 1251 mission specifications. Five of these reflect scenarios we defined with two well-known industrial partners developing human-size robots. We validated our patterns' correctness with simulators and two real robots.
LOJun 22, 2018
Verifying MITL formulae on Timed Automata considering a Continuous Time SemanticsClaudio Menghi, Marcello Bersani, Matteo Rossi et al.
Timed Automata (TA) is de facto a standard modelling formalism to represent systems when the interest is the analysis of their behaviour as time progresses. This modelling formalism is mostly used for checking whether the behaviours of a system satisfy a set of properties of interest. Even if efficient model-checkers for Timed Automata exist, these tools are not easily configurable. First, they are not designed to easily allow adding new Timed Automata constructs, such as new synchronization mechanisms or communication procedures, but they assume a fixed set of Timed Automata constructs. Second, they usually do not support the full Metric Interval Temporal Logic (MITL) and rely on a precise semantics for the logic in which the property of interest is specified which cannot be easily modified and customized. Finally, they do not easily allow using different solvers that may speed up verification in different contexts. This paper presents a novel technique to perform model checking of full Metric Interval Temporal Logic (MITL) properties on TA. The technique relies on the translation of both the TA and the MITL formula into an intermediate Constraint LTL over clocks (CLTLoc) formula which is verified through an available decision procedure. The technique is flexible since the intermediate logic allows the encoding of new semantics as well as new TA constructs, by just adding new CLTLoc formulae. Furthermore, our technique is not bound to a specific solver as the intermediate CLTLoc formula can be verified using different procedures.