SEApr 21
FIKA: Expanding Dependency Reachability with Executability GuaranteesYogya Gamage, Meriem Ben Chaaben, Martin Monperrus et al.
Automated third-party library analysis tools help developers by addressing key dependency management challenges, such as automating version updates, detecting vulnerabilities, and detecting breaking updates. Dependency reachability analysis aims at improving the precision of dependency management, by reducing the space of dependency issues to the ones that actually matter. Most tools for dependency reachability analysis are static and fundamentally limited by the absence of execution. In this paper, we propose FIKA, a pipeline for providing guarantees of executability for third-party library call sites. FIKA generates code that is executed, and whose execution trace provides guarantees that a third-party library call site is actually reachable. We apply our approach to a dataset of eight Java projects to empirically evaluate the effectiveness of FIKA. On average, 54% of these call sites are covered by the existing test suites, and therefore, have evidence for their executability. FIKA further improves this coverage by 20% and is able to demonstrate executability for 2363 dependency methods. In six out of eight projects, FIKA provides strong guarantees that more than 75% of call sites are executable. We further demonstrate that FIKA is capable of improving the results provided by Semgrep, a state-of-the-art static vulnerability reachability analysis tool. We show that FIKA can help prioritize the vulnerability updates with stronger guarantees of executability in cases where Semgrep yields inconclusive reachability results.
SEOct 27, 2025
The Design Space of Lockfiles Across Package ManagersYogya Gamage, Deepika Tiwari, Martin Monperrus et al.
Software developers reuse third-party packages that are hosted in package registries. At build time, a package manager resolves and fetches the direct and indirect dependencies of a project. Most package managers also generate a lockfile, which records the exact set of resolved dependency versions. Lockfiles are used to reduce build times; to verify the integrity of resolved packages; and to support build reproducibility across environments and time. Despite these beneficial features, developers often struggle with their maintenance, usage, and interpretation. In this study, we unveil the major challenges related to lockfiles, such that future researchers and engineers can address them. We perform the first comprehensive study of lockfiles across 7 popular package managers, npm, pnpm, Cargo, Poetry, Pipenv, Gradle, and Go. First, we highlight the wide variety of design decisions that package managers make, regarding the generation process as well as the content of lockfiles. Next, we conduct a qualitative analysis based on semi-structured interviews with 15 developers. We capture first-hand insights about the benefits that developers perceive in lockfiles, as well as the challenges they face to manage these files. Following these observations, we make 5 recommendations to further improve lockfiles, for a better developer experience.
SEMar 17
Unpacking Security Scanners for GitHub Actions WorkflowsMadjda Fares, Yogya Gamage, Benoit Baudry
GitHub Actions is a widely used platform to automate the build and deployment of software projects through configurable workflows. As the platform's popularity grows, it also becomes a target of choice for software supply chain attacks. These attacks exploit excessive permissions, ambiguous versions or the absence of artifact integrity checks to compromise the workflows. In response to these attacks, several security scanners have emerged to help developers harden their workflows. In this paper, we perform the first systematic comparison of 9 GitHub Actions Workflows security scanners. We compare them regarding scope (which security weaknesses they target), detection capabilities (how many weaknesses they detect), and performance (how long they take to scan a workflow). In order to compare the scanners on a common ground, we first establish a classification of 10 common security weaknesses that can be found in GitHub Actions Workflows. Then, we run the scanners against a curated set of 2722 workflows. Our study reveals that the landscape of GitHub Actions Workflows security scanners is very diverse, with both general purpose and focused scanners. More importantly, we provide evidence that these scanners implement fundamentally different analysis strategies, leading to major gaps regarding the nature and the number of reported security weaknesses. Based on these empirical evidence we make actionable recommendations for developers to harden their GitHub Actions Workflows.
SEJan 31, 2024
Generative AI to Generate Test Data GeneratorsBenoit Baudry, Khashayar Etemadi, Sen Fang et al.
Generating fake data is an essential dimension of modern software testing, as demonstrated by the number and significance of data faking libraries. Yet, developers of faking libraries cannot keep up with the wide range of data to be generated for different natural languages and domains. In this paper, we assess the ability of generative AI for generating test data in different domains. We design three types of prompts for Large Language Models (LLMs), which perform test data generation tasks at different levels of integrability: 1) raw test data generation, 2) synthesizing programs in a specific language that generate useful test data, and 3) producing programs that use state-of-the-art faker libraries. We evaluate our approach by prompting LLMs to generate test data for 11 domains. The results show that LLMs can successfully generate realistic test data generators in a wide range of domains at all three levels of integrability.