6.9 · SE · Apr 28
Scenario-based System Testing for Distributed Robotics Applications
Jan Peleska, Felix Brüning, Wen-Ling Huang et al.
We present the SCenario Specification Language (SCSL) for automated generation and execution of system-level tests. SCSL targets complex distributed systems (e.g., collaborating autonomous robots) where classical model-based testing becomes impractical because (1) the overall system complexity is too high for a single monolithic model, (2) test behaviour cannot be fully precomputed due to substantial nondeterminism in the distributed system under test (SUT), and (3) the SUT configuration may change dynamically at runtime. Challenge (1) is addressed by scenarios: each scenario specifies test-specific expected SUT behaviour and/or stimuli to be applied during execution. Complex system tests are composed from elementary scenarios using sequential and parallel composition. To address (2), the SCSL tool platform supports online (on-the-fly) testing, selecting and executing test steps during runtime. For (3), SCSL provides a collaboration construct that supports dynamic reconfiguration: removing unavailable components, registering newly joining components, and rewiring interfaces during test execution. We illustrate the syntax and semantics of SCSL using a system-test example in which robots perform a salvage mission, and we use an automatically generated test execution to demonstrate the concepts supported by our prototype tool platform.
CV · Dec 21, 2023
A Stochastic Approach to Classification Error Estimates in Convolutional Neural Networks
Jan Peleska, Felix Brüning, Mario Gleirscher et al.
This technical report presents research results achieved in the field of verification of trained Convolutional Neural Networks (CNNs) used for image classification in safety-critical applications. As a running example, we use the obstacle detection function needed in future autonomous freight trains with Grade of Automation (GoA) 4. It is shown that systems like GoA 4 freight trains are indeed certifiable today with new standards like ANSI/UL 4600 and ISO 21448 used in addition to the long-existing standards EN 50128 and EN 50129. Moreover, we present a quantitative analysis of the system-level hazard rate to be expected from an obstacle detection function. It is shown that using sensor/perceptor fusion, the fused detection system can meet the tolerable hazard rate deemed to be acceptable for the safety integrity level to be applied (SIL-3). A mathematical analysis of CNN models is performed which results in the identification of classification clusters and equivalence classes partitioning the image input space of the CNN. These clusters and classes are used to introduce a novel statistical testing method for determining the residual error probability of a trained CNN and an associated upper confidence limit. We argue that this greybox approach to CNN verification, taking into account the CNN model's internal structure, is essential for justifying that the statistical tests have covered the trained CNN with its neurons and inter-layer mappings in a comprehensive way.
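To make the notion of a residual error probability with an upper confidence limit concrete, the following is a minimal sketch of the textbook zero-failure binomial bound. It is not the cluster- and equivalence-class-based method of the report; it assumes independent test images drawn from the operational distribution, and the function names are hypothetical.

```python
import math

def upper_limit_zero_failures(n: int, confidence: float = 0.95) -> float:
    """Upper confidence limit on the error probability after n
    independent tests with zero observed misclassifications.
    Solves (1 - p)^n = alpha for p (standard binomial bound)."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n)

def samples_needed(p_max: float, confidence: float = 0.95) -> int:
    """Smallest n such that zero failures in n tests bounds the
    error probability by p_max at the given confidence level."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - p_max))

if __name__ == "__main__":
    n = samples_needed(1e-3)
    print(n, upper_limit_zero_failures(n))
```

Under these assumptions the required sample count grows roughly in inverse proportion to the target error probability, which gives a sense of why plain blackbox sampling alone becomes expensive for very small targets.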
SE · Oct 25, 2021
Complete Agent-driven Model-based System Testing for Autonomous Systems
Kerstin I. Eder, Wen-ling Huang, Jan Peleska
In this position paper, a novel approach to testing complex autonomous transportation systems (ATS) in the automotive, avionic, and railway domains is described. It is intended to mitigate some of the most critical problems regarding verification and validation (V&V) effort for ATS. V&V is known to become infeasible for complex ATS when only conventional methods are used. The approach advocated here uses complete testing methods on the module level, because these establish formal proofs for the logical correctness of the software. Having established logical correctness, system-level tests are performed in simulated cloud environments and on the target system. To give evidence that 'sufficiently many' system tests have been performed with the target system, a formally justified coverage criterion is introduced. To optimise the execution of very large system test suites, we advocate an online testing approach where multiple tests are executed in parallel, and test steps are identified on-the-fly. The coordination and optimisation of these executions are achieved by an agent-based approach. Each aspect of the testing approach advocated here is either shown to be consistent with existing standards for development and V&V of safety-critical transportation systems, or a justification is given for why it should become acceptable in future revisions of the applicable standards.
SE · May 25, 2021
Complete Requirements-based Testing with Finite State Machines
Wen-ling Huang, Jan Peleska
In this paper, new contributions to requirements-based testing with deterministic finite state machines are presented. Elementary requirements are specified as triples consisting of a state in the reference model, an input, and the expected reaction of the system under test defined by a set of admissible outputs, allowing for different implementation variants. Composite requirements are specified as collections of elementary ones. Two requirements-driven test generation strategies are introduced, and their fault coverage guarantees are proven. The first is exhaustive in the sense that it produces test suites guaranteeing requirements satisfaction if the test suite is passed. If the test suite execution fails for a given implementation, however, this does not imply that the requirement has been violated. Instead, the failure may indicate an arbitrary violation of I/O-equivalence, which could be unrelated to the requirement under test. The second strategy is complete in the sense that it produces test suites guaranteeing requirements satisfaction if and only if the suite is passed. Complexity considerations indicate that for practical application, the first strategy should be preferred to the second. Typical application scenarios for this approach are safety-critical systems, where safety requirements should be tested with maximal thoroughness, while user requirements might be checked with lesser effort, using conventional testing heuristics.
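As an illustration of the requirement triples described in this abstract (reference-model state, input, set of admissible outputs), here is a minimal sketch of a bounded check against a deterministic FSM. It is not either of the paper's two test generation strategies and gives no completeness guarantee; the class names, the example machines, and the assumption that the SUT is itself available as an FSM are all hypothetical.

```python
from itertools import product
from typing import NamedTuple

class DFSM(NamedTuple):
    initial: str
    delta: dict  # (state, input) -> (next_state, output)

class Requirement(NamedTuple):
    state: str             # state of the reference model
    inp: str               # input applied in that state
    admissible: frozenset  # outputs the SUT is allowed to produce

def violates(ref: DFSM, sut: DFSM, req: Requirement, trace) -> bool:
    """True if the SUT produces an inadmissible output at a step where
    the reference model is in req.state and the input is req.inp."""
    r, s = ref.initial, sut.initial
    for x in trace:
        s, y = sut.delta[(s, x)]
        if r == req.state and x == req.inp and y not in req.admissible:
            return True
        r, _ = ref.delta[(r, x)]
    return False

def bounded_check(ref, sut, req, inputs, max_len):
    """Try all input traces up to max_len; return a violating trace or None."""
    for n in range(1, max_len + 1):
        for trace in product(inputs, repeat=n):
            if violates(ref, sut, req, trace):
                return trace
    return None

# Hypothetical two-state reference model and two implementation variants.
ref = DFSM("A", {("A", "a"): ("B", 0), ("A", "b"): ("A", 0),
                 ("B", "a"): ("A", 1), ("B", "b"): ("B", 0)})
sut_ok = DFSM("A", {**ref.delta, ("B", "a"): ("A", 2)})   # admissible variant
sut_bad = DFSM("A", {**ref.delta, ("B", "a"): ("A", 0)})  # violates the requirement
req = Requirement("B", "a", frozenset({1, 2}))

if __name__ == "__main__":
    print(bounded_check(ref, sut_ok, req, ("a", "b"), 3))
    print(bounded_check(ref, sut_bad, req, ("a", "b"), 3))
```

Note that `sut_ok` is not I/O-equivalent to the reference model yet still satisfies the requirement, which mirrors the abstract's point that a failed equivalence test need not indicate a violation of the requirement under test.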