CL · Aug 24, 2024
Are LLM-based methods good enough for detecting unfair terms of service?
Mirgita Frasheri, Arian Bakhtiarnia, Lukas Esterle et al.
Countless terms of service (ToS) are signed every day by users all over the world while interacting with all kinds of apps and websites. More often than not, these online contracts spanning double-digit pages are signed blindly by users who simply want immediate access to the desired service. What would normally require a consultation with a legal team has now become a mundane activity consisting of a few clicks, in which users potentially sign away their rights, for instance in terms of their data privacy, to countless online entities and companies. Large language models (LLMs) are good at parsing long text-based documents, and could potentially be adopted to help users deal with dubious clauses in ToS and their underlying privacy policies. To investigate the utility of existing models for this task, we first build a dataset consisting of 12 questions applied individually to a set of privacy policies crawled from popular websites. Thereafter, a series of open-source as well as commercial chatbots, such as ChatGPT, are queried over each question, with the answers being compared to a given ground truth. Our results show that some open-source models achieve higher accuracy than some commercial models. However, the best performance is recorded from a commercial chatbot (ChatGPT4). Overall, all models perform only slightly better than random at this task. Consequently, their performance needs to be significantly improved before they can be widely adopted for this purpose.
RO · May 4
Human-in-the-Loop Uncertainty Analysis in Self-Adaptive Robots Using LLMs
Hassan Sartaj, Jalil Boudjadar, Mirgita Frasheri et al.
Self-adaptive robots operate in dynamic, unpredictable environments where unaddressed uncertainties can lead to safety violations and operational failures. However, systematically identifying and analyzing these uncertainties, including their sources, impacts, and mitigation strategies, remains a significant challenge given the inherent complexity of real-world environments, dynamic robotic behavior, and the rapid evolution of robotic technologies. To address this, we introduce RoboULM, a human-in-the-loop methodology and tool that supports practitioners in systematically exploring uncertainties at the design stage using large language models (LLMs). Moreover, we present an uncertainty taxonomy that provides a detailed catalog of uncertainties in self-adaptive robots. We evaluated RoboULM with 16 practitioners from four industrial use cases. The results show that RoboULM was perceived as both useful and easy to understand, with the participants particularly valuing structured prompting and iterative refinement support. These findings demonstrate the potential of RoboULM as a viable solution for systematic uncertainty analysis in complex robots.
RO · Jul 2, 2021
RMQFMU: Bridging the Real World with Co-simulation Technical Report
Mirgita Frasheri, Henrik Ejersbo, Casper Thule et al.
In this paper we present an experience report for the RMQFMU, a plug-and-play tool that enables feeding data to and from an FMI2-based co-simulation environment over the AMQP protocol. On one side, bridging the co-simulation to an external environment allows historical data to be fed to the co-simulation, serving different purposes, such as visualisation and/or data analysis. On the other side, such a tool facilitates the realisation of the digital twin concept by coupling co-simulation and hardware/robots close to real-time. In the paper we present limitations of the initial version of the RMQFMU with respect to its capability of bridging co-simulation with the real world. To provide the desired functionality of the tool, we present in a step-by-step fashion how these limitations, and subsequent limitations, are alleviated. We perform various experiments in order to justify the modifications carried out. Finally, we report on two case studies where we have adopted the RMQFMU, and provide guidelines meant to aid practitioners in its use.
SE · Jun 30, 2021
Ethical AI-Powered Regression Test Selection
Per Erik Strandberg, Mirgita Frasheri, Eduard Paul Enoiu
Test automation is common in software development; often one tests repeatedly to identify regressions. If the number of test cases is large, one may select a subset and only use the most important test cases. The regression test selection (RTS) could be automated and enhanced with Artificial Intelligence (AI-RTS). This, however, could introduce ethical challenges. While such challenges in AI are in general well studied, there is a gap with respect to ethical AI-RTS. By exploring the literature and learning from our experiences of developing an industry AI-RTS tool, we contribute to the literature by identifying three challenges (assigning responsibility, bias in decision-making and lack of participation) and three approaches (explicability, supervision and diversity). Additionally, we provide a checklist for ethical AI-RTS to help guide the decision-making of the stakeholders involved in the process.
SE · Jul 20, 2020
Agent-Based Software Testing: A Definition and Systematic Mapping Study
Pavithra Perumal Kumaresen, Mirgita Frasheri, Eduard Enoiu
The emergence of new technologies in software testing has increased the automation and flexibility of the testing process. In this context, the adoption of agents in software testing remains an active research area in which various agent methodologies, architectures, and tools are employed to address different testing problems. Even though research that investigates agents in software testing has been growing, these agent-based techniques should be considered from a broader perspective. In order to provide a comprehensive overview of this research area, which we define as agent-based software testing (ABST), a systematic mapping study has been conducted. This mapping study aims to identify the topics studied within ABST, as well as examine the adopted research methodologies, identify the gaps in the current research and point to directions for future ABST research. Our results suggest that there is an interest in ABST after 1999 that resulted in the development of solutions using reactive, BDI, deliberative and cooperative agent architectures for software testing. In addition, most of the ABST approaches are designed using the JADE framework, have targeted the Java programming language, and are used in system-level testing for functional, non-functional and white-box testing. With regard to regression testing, our results indicate a research gap that could be addressed in future studies.
SE · Feb 12, 2018
Test Agents: Adaptive, Autonomous and Intelligent Test Cases
Eduard Enoiu, Mirgita Frasheri
Growth of software size, lack of resources to perform regression testing, and failure to detect bugs faster have seen increased reliance on continuous integration and test automation. Even with greater hardware and software resources dedicated to test automation, software testing is faced with enormous challenges, resulting in increased dependence on complex mechanisms for automated test case selection and prioritization as part of a continuous integration framework. These mechanisms currently use simple entities called test cases that are concretely realized as executable scripts. Our key idea is to provide test cases with more reasoning, adaptive behavior and learning capabilities by using the concepts of intelligent software agents. We refer to such test cases as test agents. The model that underlies a test agent is capable of flexible and autonomous actions in order to meet overall testing objectives. Our goal is to increase the decentralization of regression testing by letting test agents know for themselves when they should be executing, how they should update their purpose, and when they should interact with each other. In this paper, we envision software test agents that display such adaptive autonomous behavior. Emerging developments and challenges regarding the use of test agents are explored, in particular new research that seeks to use adaptive autonomous agents in software testing.