Lina Marsso

CR
h-index7
7papers
51citations
Novelty26%
AI Score34

7 Papers

AIMar 12
Social, Legal, Ethical, Empathetic and Cultural Norm Operationalisation for AI Agents

Radu Calinescu, Ana Cavalcanti, Marsha Chechik et al.

As AI agents are increasingly used in high-stakes domains like healthcare and law enforcement, aligning their behaviour with social, legal, ethical, empathetic, and cultural (SLEEC) norms has become a critical engineering challenge. While international frameworks have established high-level normative principles for AI, a significant gap remains in translating these abstract principles into concrete, verifiable requirements. To address this gap, we propose a systematic SLEEC-norm operationalisation process for determining, validating, implementing, and verifying normative requirements. Furthermore, we survey the landscape of methods and tools supporting this process, and identify key remaining challenges and research avenues for addressing them. We thus establish a framework - and define a research and policy agenda - for developing AI agents that are not only functionally useful but also demonstrably aligned with human norms and values.

CVFeb 29, 2024
Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance

Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki et al.

While Neural Networks (NNs) have surpassed human accuracy in image classification on ImageNet, they often lack robustness against image corruption, i.e., corruption robustness. Yet such robustness is seemingly effortless for human perception. In this paper, we propose visually-continuous corruption robustness (VCR) -- an extension of corruption robustness to allow assessing it over the wide and continuous range of changes that correspond to the human perceptive quality (i.e., from the original image to the full distortion of all perceived visual information), along with two novel human-aware metrics for NN evaluation. To compare VCR of NNs with human perception, we conducted extensive experiments on 14 commonly used image corruptions with 7,718 human participants and state-of-the-art robust NN models with different training objectives (e.g., standard, adversarial, corruption robustness), different architectures (e.g., convolution NNs, vision transformers), and different amounts of training data augmentation. Our study showed that: 1) assessing robustness against continuous corruption can reveal insufficient robustness undetected by existing benchmarks; as a result, 2) the gap between NN and human robustness is larger than previously known; and finally, 3) some image corruptions have a similar impact on human perception, offering opportunities for more cost-effective robustness assessments. Our validation set with 14 image corruptions, human robustness data, and the evaluation code is provided as a toolbox and a benchmark.

SEFeb 8, 2022
If a Human Can See It, So Should Your System: Reliability Requirements for Machine Vision Components

Boyue Caroline Hu, Lina Marsso, Krzysztof Czarnecki et al.

Machine Vision Components (MVC) are becoming safety-critical. Assuring their quality, including safety, is essential for their successful deployment. Assurance relies on the availability of precisely specified and, ideally, machine-verifiable requirements. MVCs with state-of-the-art performance rely on machine learning (ML) and training data but largely lack such requirements. In this paper, we address the need for defining machine-verifiable reliability requirements for MVCs against transformations that simulate the full range of realistic and safety-critical changes in the environment. Using human performance as a baseline, we define reliability requirements as: 'if the changes in an image do not affect a human's decision, neither should they affect the MVC's.' To this end, we provide: (1) a class of safety-related image transformations; (2) reliability requirement classes to specify correctness-preservation and prediction-preservation for MVCs; (3) a method to instantiate machine-verifiable requirements from these requirements classes using human performance experiment data; (4) human performance experiment data for image recognition involving eight commonly used transformations, from about 2000 human participants; and (5) a method for automatically checking whether an MVC satisfies our requirements. Further, we show that our reliability requirements are feasible and reusable by evaluating our methods on 13 state-of-the-art pre-trained image classification models. Finally, we demonstrate that our approach detects reliability gaps in MVCs that other existing methods are unable to detect.

CRApr 28, 2020
Specifying a Cryptographical Protocol in Lustre and SCADE

Lina Marsso

We present SCADE and Lustre models of the Message Authenticator Algorithm (MAA), which is one of the first cryptographic functions for computing a message authentication code. The MAA was adopted between 1987 and 2001, in international standards (ISO 8730 and ISO 8731-2), to ensure the authenticity and integrity of banking transactions. This paper discusses the choices and the challenges of our MAA implementations. Our SCADE and Lustre models validate 201 official test vectors for the MAA.

PLMar 27, 2018
Comparative Study of Eight Formal Specifications of the Message Authenticator Algorithm

Hubert Garavel, Lina Marsso

The Message Authenticator Algorithm (MAA) is one of the first cryptographic functions for computing a Message Authentication Code. Between 1987 and 2001, the MAA was adopted in international standards (ISO 8730 and ISO 8731-2) to ensure the authenticity and integrity of banking transactions. In 1990 and 1991, three formal, yet non-executable, specifications of the MAA (in VDM, Z, and LOTOS) were developed at NPL. Since then, five formal executable specifications of the MAA (in LOTOS, LNT, and term rewrite systems) have been designed at INRIA Grenoble. This article provides an overview of the MAA and compares its formal specifications with respect to common-sense criteria, such as conciseness, readability, and efficiency of code generation.

CRMar 27, 2018
A Formal TLS Handshake Model in LNT

Josip Bozic, Lina Marsso, Radu Mateescu et al.

Testing of network services represents one of the biggest challenges in cyber security. Because new vulnerabilities are detected on a regular basis, more research is needed. These faults have their roots in the software development cycle or because of intrinsic leaks in the system specification. Conformance testing checks whether a system behaves according to its specification. Here model-based testing provides several methods for automated detection of shortcomings. The formal specification of a system behavior represents the starting point of the testing process. In this paper, a widely used cryptographic protocol is specified and tested for conformance with a test execution framework. The first empirical results are presented and discussed.

CRMar 20, 2017
A Large Term Rewrite System Modelling a Pioneering Cryptographic Algorithm

Hubert Garavel, Lina Marsso

We present a term rewrite system that formally models the Message Authenticator Algorithm (MAA), which was one of the first cryptographic functions for computing a Message Authentication Code and was adopted, between 1987 and 2001, in international standards (ISO 8730 and ISO 8731-2) to ensure the authenticity and integrity of banking transactions. Our term rewrite system is large (13 sorts, 18 constructors, 644 non-constructors, and 684 rewrite rules), confluent, and terminating. Implementations in thirteen different languages have been automatically derived from this model and used to validate 200 official test vectors for the MAA.