Erin Lanus

h-index6

5papers

55citations

Novelty27%

AI Score20

Ranked #185,706 of 194,257 authors (top 96%)#6,035 in CR (top 89%)

5 Papers

5.5SEOct 10, 2023

Test & Evaluation Best Practices for Machine Learning-Enabled Systems

Jaganmohan Chandrasekaran, Tyler Cody, Nicola McCarthy et al.

Machine learning (ML) - based software systems are rapidly gaining adoption across various domains, making it increasingly essential to ensure they perform as intended. This report presents best practices for the Test and Evaluation (T&E) of ML-enabled software systems across its lifecycle. We categorize the lifecycle of ML-enabled software systems into three stages: component, integration and deployment, and post-deployment. At the component level, the primary objective is to test and evaluate the ML model as a standalone component. Next, in the integration and deployment stage, the goal is to evaluate an integrated ML-enabled system consisting of both ML and non-ML components. Finally, once the ML-enabled software system is deployed and operationalized, the T&E objective is to ensure the system performs as intended. Maintenance activities for ML-enabled software systems span the lifecycle and involve maintaining various assets of ML-enabled software systems. Given its unique characteristics, the T&E of ML-enabled software systems is challenging. While significant research has been reported on T&E at the component level, limited work is reported on T&E in the remaining two stages. Furthermore, in many cases, there is a lack of systematic T&E strategies throughout the ML-enabled system's lifecycle. This leads practitioners to resort to ad-hoc T&E practices, which can undermine user confidence in the reliability of ML-enabled software systems. New systematic testing approaches, adequacy measurements, and metrics are required to address the T&E challenges across all stages of the ML-enabled system lifecycle.

2.9CRMay 29, 2022

Evaluating Automated Driving Planner Robustness against Adversarial Influence

Andres Molina-Markham, Silvia G. Ionescu, Erin Lanus et al.

Evaluating the robustness of automated driving planners is a critical and challenging task. Although methodologies to evaluate vehicles are well established, they do not yet account for a reality in which vehicles with autonomous components share the road with adversarial agents. Our approach, based on probabilistic trust models, aims to help researchers assess the robustness of protections for machine learning-enabled planners against adversarial influence. In contrast with established practices that evaluate safety using the same evaluation dataset for all vehicles, we argue that adversarial evaluation fundamentally requires a process that seeks to defeat a specific protection. Hence, we propose that evaluations be based on estimating the difficulty for an adversary to determine conditions that effectively induce unsafe behavior. This type of inference requires precise statements about threats, protections, and aspects of planning decisions to be guarded. We demonstrate our approach by evaluating protections for planners relying on camera-based object detectors.

7.8LGJan 28, 2022

Systematic Training and Testing for Machine Learning Using Combinatorial Interaction Testing

Tyler Cody, Erin Lanus, Daniel D. Doyle et al.

This paper demonstrates the systematic use of combinatorial coverage for selecting and characterizing test and training sets for machine learning models. The presented work adapts combinatorial interaction testing, which has been successfully leveraged in identifying faults in software testing, to characterize data used in machine learning. The MNIST hand-written digits data is used to demonstrate that combinatorial coverage can be used to select test sets that stress machine learning model performance, to select training sets that lead to robust model performance, and to select data for fine-tuning models to new domains. Thus, the results posit combinatorial coverage as a holistic approach to training and testing for machine learning. In contrast to prior work which has focused on the use of coverage in regard to the internal of neural networks, this paper considers coverage over simple features derived from inputs and outputs. Thus, this paper addresses the case where the supplier of test and training sets for machine learning models does not have intellectual property rights to the models themselves. Finally, the paper addresses prior criticism of combinatorial coverage and provides a rebuttal which advocates the use of coverage metrics in machine learning applications.

3.3SYJan 25, 2021

Test and Evaluation Framework for Multi-Agent Systems of Autonomous Intelligent Agents

Erin Lanus, Ivan Hernandez, Adam Dachowicz et al.

Test and evaluation is a necessary process for ensuring that engineered systems perform as intended under a variety of conditions, both expected and unexpected. In this work, we consider the unique challenges of developing a unifying test and evaluation framework for complex ensembles of cyber-physical systems with embedded artificial intelligence. We propose a framework that incorporates test and evaluation throughout not only the development life cycle, but continues into operation as the system learns and adapts in a noisy, changing, and contended environment. The framework accounts for the challenges of testing the integration of diverse systems at various hierarchical scales of composition while respecting that testing time and resources are limited. A generic use case is provided for illustrative purposes and research directions emerging as a result of exploring the use case via the framework are suggested.

7.2CRMar 16, 2020

Formal Methods Analysis of the Secure Remote Password Protocol

Alan T. Sherman, Erin Lanus, Moses Liskov et al.

We analyze the Secure Remote Password (SRP) protocol for structural weaknesses using the Cryptographic Protocol Shapes Analyzer (CPSA) in the first formal analysis of SRP (specifically, Version 3). SRP is a widely deployed Password Authenticated Key Exchange (PAKE) protocol used in 1Password, iCloud Keychain, and other products. As with many PAKE protocols, two participants use knowledge of a pre-shared password to authenticate each other and establish a session key. SRP aims to resist dictionary attacks, not store plaintext-equivalent passwords on the server, avoid patent infringement, and avoid export controls by not using encryption. Formal analysis of SRP is challenging in part because existing tools provide no simple way to reason about its use of the mathematical expression $v + g^b \mod q$. Modeling $v + g^b$ as encryption, we complete an exhaustive study of all possible execution sequences of SRP. Ignoring possible algebraic attacks, this analysis detects no major structural weakness, and in particular no leakage of any secrets. We do uncover one notable weakness of SRP, which follows from its design constraints. It is possible for a malicious server to fake an authentication session with a client, without the client's participation. This action might facilitate an escalation of privilege attack, if the client has higher privileges than does the server. We conceived of this attack before we used CPSA and confirmed it by generating corresponding execution shapes using CPSA.