Andrea Arcuri

SE
h-index: 15
10 papers
415 citations
Novelty: 30%
AI Score: 49

10 Papers

SE · Apr 2 · Code
Fuzzing REST APIs in Industry: Necessary Features and Open Problems

Andrea Arcuri, Alexander Poth, Olsi Rrjolli et al.

REST APIs are widely used in industry, across many different domains. An example is Volkswagen AG, a German automobile manufacturer. Established testing approaches for REST APIs are time consuming, and require expertise from professional test engineers. Due to the cost and importance of such testing, several approaches have been proposed in the scientific literature to automatically test REST APIs. The open-source, search-based fuzzer EvoMaster is one such tool proposed in the academic literature. However, how academic prototypes can be integrated in industry and have real impact on software engineering practice requires more investigation. In this paper, we report on our experience in using EvoMaster at Volkswagen AG, as an EvoMaster user from 2023 to 2026. We share our lessons learned, and discuss several features needed to be implemented in EvoMaster to make its use in an industrial context successful. Volkswagen AG provided feedback on the value of EvoMaster in industrial setups for 4 APIs. Additionally, a user study was conducted involving 11 testing specialists from 4 different companies. We further identify several real-world research challenges that still need to be solved.

SE · Apr 1 · Code
Enhancing REST API Fuzzing with Access Policy Violation Checks and Injection Attacks

Omur Sahin, Man Zhang, Andrea Arcuri

Due to their widespread use in industry, several techniques have been proposed in the literature to fuzz REST APIs. Existing fuzzers for REST APIs have focused on detecting crashes (e.g., 500 HTTP server error status code). However, security vulnerabilities can have drastic consequences on existing cloud infrastructures. In this paper, we propose a series of novel automated oracles aimed at detecting violations of access policies in REST APIs, as well as executing traditional attacks such as SQL Injection and XSS. These novel automated oracles can be integrated into existing fuzzers, in which, once the fuzzing session is completed, a "security testing" phase is executed to verify these oracles. When a security fault is detected, our technique is able to generate executable test cases as output in different formats, like Java, Kotlin, Python and JavaScript test suites. Our novel techniques are integrated as an extension of EvoMaster, a state-of-the-art open-source fuzzer for REST APIs. Experiments are carried out on 9 artificial examples, 8 vulnerable-by-design REST APIs with black-box testing, and 36 REST APIs from the WFD corpus with white-box testing, for a total of 52 distinct APIs. Results show that our novel oracles and their automated integration in a fuzzing process can detect security issues in several of these APIs.

SE · Mar 30
Detecting and Mitigating Flakiness in REST API Fuzzing

Man Zhang, Chongyang Shen, Andrea Arcuri et al.

Test flakiness is a common problem in industry, which hinders the reliability of automated build and testing workflows. Most existing research on test flakiness has primarily focused on unit and small-scale integration tests. In contrast, flakiness in system-level testing, such as testing of REST APIs, is comparatively under-explored. A large body of literature has been dedicated to the topic of fuzzing REST APIs, whereas relatively little attention has been paid to detecting and possibly mitigating negative effects of flakiness in this context. To fill this major gap, in this paper, we study the flakiness of tests generated by one of the most widely applied REST API fuzzers in the literature, namely EvoMaster, and conduct empirical studies with a corpus of 36 REST APIs to understand flakiness of REST APIs. Based on the results of the empirical studies, we categorize and analyze flakiness sources by inspecting nearly 3000 failing tests. Based on this understanding, we propose FlakyCatch to detect and mitigate flakiness in REST APIs and empirically evaluate its performance. Results show that FlakyCatch is effective in detecting and handling flakiness in tests generated by white-box and black-box fuzzers.
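As a rough illustration of what detecting flakiness means in this setting, the sketch below reruns each test several times and flags any test whose outcome changes between runs. This is a generic rerun-based check, not FlakyCatch, whose design the abstract does not describe. The function name `classify_by_rerun`, the `runs` parameter, and the convention that a test is a callable returning `True` on pass are all assumptions made for the example.

```python
from typing import Callable, Dict


def classify_by_rerun(tests: Dict[str, Callable[[], bool]], runs: int = 5) -> Dict[str, str]:
    """Run every test `runs` times and label it 'pass', 'fail', or 'flaky'.

    Hypothetical helper: a test is any zero-argument callable that
    returns True when it passes and False when it fails.
    """
    labels: Dict[str, str] = {}
    for name, test in tests.items():
        # Collect the distinct outcomes observed across all reruns.
        outcomes = {bool(test()) for _ in range(runs)}
        if len(outcomes) > 1:
            labels[name] = "flaky"  # both pass and fail were observed
        else:
            labels[name] = "pass" if outcomes.pop() else "fail"
    return labels


def make_alternating_test() -> Callable[[], bool]:
    """Build a deterministic stand-in for a flaky test: passes on odd calls only."""
    state = {"calls": 0}

    def test() -> bool:
        state["calls"] += 1
        return state["calls"] % 2 == 1

    return test


if __name__ == "__main__":
    suite = {
        "always_passes": lambda: True,
        "always_fails": lambda: False,
        "alternates": make_alternating_test(),
    }
    print(classify_by_rerun(suite, runs=4))
```

A rerun check of this kind can only confirm flakiness that shows up within the chosen number of runs; a test labelled 'pass' or 'fail' here may still be flaky under other conditions.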

SE · Apr 5, 2021 · Code
Model-based testing in practice: An experience report from the web applications domain

Vahid Garousi, Alper Buğra Keleş, Yunus Balaman et al.

In the context of a large software testing company, we have deployed the model-based testing (MBT) approach to take the company's test automation practices to higher levels of maturity and capability. We have chosen, from a set of open-source/commercial MBT tools, an open-source tool named GraphWalker, and have pragmatically used MBT for end-to-end test automation of several large web and mobile applications under test. The MBT approach has provided, so far in our project, various tangible and intangible benefits in terms of improved test coverage (number of paths tested), improved test-design practices, and also improved real-fault detection effectiveness. The goal of this experience report (applied research report), done based on "action research", is to share our experience of applying and evaluating MBT as a software technology (technique and tool) in a real industrial setting. We aim at contributing to the body of empirical evidence in industrial application of MBT by sharing our industry-academia project on applying MBT in practice, the insights that we have gained, and the challenges and questions that we have faced and tackled so far. We discuss an overview of the industrial setting, provide motivation, explain the events leading to the outcomes, discuss the challenges faced, summarize the outcomes, and conclude with lessons learned, take-away messages, and practical advice based on the described experience. By learning from the best practices in this paper, other test engineers could conduct more mature MBT in their test projects.

SE · Jan 12, 2019 · Code
An Experience Report On Applying Software Testing Academic Results In Industry: We Need Usable Automated Test Generation

Andrea Arcuri

What is the impact of software engineering research on current practices in industry? In this paper, I report on my direct experience as a PhD/post-doc working in software engineering research projects, and then spending the following five years as an engineer in two different companies (the first one being the same one I collaborated with during my post-doc). Given a background in software engineering research, what cutting-edge techniques and tools from academia did I use in my daily work when developing and testing the systems of these companies? Regarding validation and verification (my main area of research), the answer is rather short: as far as I can tell, only FindBugs. In this paper, I report on why this was the case, and discuss all the challenging, complex open problems we face in industry and which somehow are "neglected" in the academic circles. In particular, I will first discuss what actual tools I could use in my daily work, such as JaCoCo and Selenium. Then, I will discuss the main open problems I faced, particularly related to environment simulators, unit and web testing. After that, popular topics in academia are presented, such as UML, regression and mutation testing. Their lack of impact on the type of projects I worked on in industry is then discussed. Finally, from this industrial experience, I provide my opinions about how this situation can be improved, in particular related to how academics are evaluated, and advocate for a greater involvement in open-source projects.

SE · Jan 12, 2019 · Code
EvoMaster: Evolutionary Multi-context Automated System Test Generation

Andrea Arcuri

This paper presents EvoMaster, an open-source tool that is able to automatically generate system-level test cases using evolutionary algorithms. Currently, EvoMaster targets RESTful web services running on JVM technology, and has been used to find several faults in existing open-source projects. We discuss some of the architectural decisions made for its implementation, and future work.

SE · Jan 6, 2019 · Code
RESTful API Automated Test Case Generation

Andrea Arcuri

Nowadays, web services play a major role in the development of enterprise applications. Many such applications are now developed using a service-oriented architecture (SOA), of which microservices are one of the most popular kinds. A RESTful web service will provide data via an API over the network using HTTP, possibly interacting with databases and other web services. Testing a RESTful API poses challenges, as inputs/outputs are sequences of HTTP requests/responses to a remote server. Many approaches in the literature do black-box testing, as the tested API is a remote service whose code is not available. In this paper, we consider testing from the point of view of the developers, who do have full access to the code that they are writing. Therefore, we propose a fully automated white-box testing approach, where test cases are automatically generated using an evolutionary algorithm. Tests are rewarded based on code coverage and fault finding metrics. We implemented our technique in a tool called EVOMASTER, which is open-source. Experiments on two open-source, yet non-trivial RESTful services and an industrial one, do show that our novel technique did automatically find 38 real bugs in those applications. However, obtained code coverage is lower than the one achieved by the manually written test suites already existing in those services. Research directions on how to further improve such an approach are therefore discussed.

SE · May 26, 2025
Search-Based Software Engineering and AI Foundation Models: Current Landscape and Future Roadmap

Hassan Sartaj, Shaukat Ali, Paolo Arcaini et al.

Search-based software engineering (SBSE), which integrates metaheuristic search techniques with software engineering, has been an active area of research for about 25 years. It has been applied to solve numerous problems across the entire software engineering lifecycle and has demonstrated its versatility in multiple domains. With recent advances in AI, particularly the emergence of foundation models (FMs) such as large language models (LLMs), the evolution of SBSE alongside these models remains undetermined. In this window of opportunity, we present a research roadmap that articulates the current landscape of SBSE in relation to FMs, identifies open challenges, and outlines potential research directions to advance SBSE through its integration and interplay with FMs. Specifically, we analyze five core aspects: leveraging FMs for SBSE design, applying FMs to complement SBSE in SE problems, employing SBSE to address FM challenges, adapting SBSE practices for FMs tailored to SE activities, and exploring the synergistic potential between SBSE and FMs. Furthermore, we present a forward-thinking perspective that envisions the future of SBSE in the era of FMs, highlighting promising research opportunities to address challenges in emerging domains.

SE · Mar 8, 2020
Software-testing education: A systematic literature mapping

Vahid Garousi, Austen Rainer, Per Lauvås et al.

Context: With the rising complexity and scale of software systems, there is an ever-increasing demand for sophisticated and cost-effective software testing. To meet such a demand, there is a need for a highly-skilled software testing work-force (test engineers) in the industry. To address that need, many university educators worldwide have included software-testing education in their software engineering (SE) or computer science (CS) programs. Objective: Our objective in this paper is to summarize the body of experience and knowledge in the area of software-testing education to benefit the readers (both educators and researchers) in designing and delivering software testing courses in university settings, and to also conduct further education research in this area. Method: To address the above need, we conducted a systematic literature mapping (SLM) to synthesize what the community of educators has published on this topic. After compiling a candidate pool of 307 papers, and applying a set of inclusion/exclusion criteria, our final pool included 204 papers published between 1992 and 2019. Results: The topic of software-testing education is becoming more active, as we can see by the increasing number of papers. Many pedagogical approaches (how to best teach testing), course-ware, and specific tools for testing education have been proposed. Many challenges in testing education and insights on how to overcome those challenges have been reported. Conclusion: This paper provides educators and researchers with a classification of existing studies within software-testing education. We further synthesize challenges and insights reported when teaching software testing. The paper also provides a reference ("index") to the vast body of knowledge and experience on teaching software testing.

SE · Jan 6, 2019
Many Independent Objective (MIO) Algorithm for Test Suite Generation

Andrea Arcuri

Automatically generating test suites is intrinsically a multi-objective problem, as any of the testing targets (e.g., statements to execute or mutants to kill) is an objective on its own. Test suite generation has peculiarities that are quite different from other more regular optimisation problems. For example, given an existing test suite, one can add more tests to cover the remaining objectives. One would like the smallest number of small tests to cover as many objectives as possible, but that is a secondary goal compared to covering those targets in the first place. Furthermore, the number of objectives in software testing can quickly become unmanageable, in the order of (tens/hundreds of) thousands, especially for system testing of industrial size systems. Traditional multi-objective optimisation algorithms can already start to struggle with just four or five objectives to optimize. To overcome these issues, different techniques have been proposed, such as the Whole Test Suite (WTS) approach and the Many-Objective Sorting Algorithm (MOSA). However, those techniques might not scale well to very large numbers of objectives and limited search budgets (a typical case in system testing). In this paper, we propose a novel algorithm, called Many Independent Objective (MIO) algorithm. This algorithm is designed and tailored based on the specific properties of test suite generation. An empirical study, on a set of artificial and actual software, shows that the MIO algorithm can achieve higher coverage compared to WTS and MOSA, as it can better exploit the peculiarities of test suite generation.
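The idea that each testing target is an objective on its own, with test size as a secondary concern, can be illustrated with a small per-target archive. The sketch below is not the MIO algorithm, whose details the abstract does not give. It only shows one way to track the best test seen so far for each target independently. The names `update_archive` and `final_suite`, the representation of a test as a tuple of integers, and the assumption that a score lies in [0, 1] with 1.0 meaning "covered" are all hypothetical.

```python
from typing import Dict, List, Tuple

Test = Tuple[int, ...]                       # assumed: a test is a tuple of actions
Archive = Dict[str, Tuple[float, Test]]      # target -> (best score, best test)


def update_archive(archive: Archive, test: Test, scores: Dict[str, float]) -> None:
    """Keep, for each target independently, the best test observed so far.

    Primary criterion: higher score for that target (assumed in [0, 1]).
    Secondary criterion: on equal score, prefer the shorter test.
    """
    for target, score in scores.items():
        best = archive.get(target)
        if (
            best is None
            or score > best[0]
            or (score == best[0] and len(test) < len(best[1]))
        ):
            archive[target] = (score, test)


def final_suite(archive: Archive) -> List[Test]:
    """Return the distinct tests that fully cover at least one target."""
    covering = {test for score, test in archive.values() if score >= 1.0}
    return sorted(covering)


if __name__ == "__main__":
    archive: Archive = {}
    update_archive(archive, (1, 2, 3), {"line_10": 1.0, "line_20": 0.5})
    update_archive(archive, (4,), {"line_10": 1.0, "line_20": 0.2})
    print(archive)
    print(final_suite(archive))
```

Because each target keeps its own entry, adding a test that covers a new target never displaces the tests already covering other targets, which matches the observation that more tests can simply be added to cover the remaining objectives.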