CRDec 3, 2025
One Detector Fits All: Robust and Adaptive Detection of Malicious Packages from PyPI to EnterprisesBiagio Montaruli, Luca Compagna, Serena Elisa Ponta et al.
The rise of supply chain attacks via malicious Python packages demands robust detection solutions. Current approaches, however, overlook two critical challenges: robustness against adversarial source code transformations and adaptability to the varying false positive rate (FPR) requirements of different actors, from repository maintainers (requiring low FPR) to enterprise security teams (higher FPR tolerance). We introduce a robust detector capable of seamless integration into both public repositories like PyPI and enterprise ecosystems. To ensure robustness, we propose a novel methodology for generating adversarial packages using fine-grained code obfuscation. Combining these with adversarial training (AT) enhances detector robustness by 2.5x. We comprehensively evaluate AT effectiveness by testing our detector against 122,398 packages collected daily from PyPI over 80 days, showing that AT needs careful application: it makes the detector more robust to obfuscations and allows finding 10% more obfuscated packages, but slightly decreases performance on non-obfuscated packages. We demonstrate production adaptability of our detector via two case studies: (i) one for PyPI maintainers (tuned at 0.1% FPR) and (ii) one for enterprise teams (tuned at 10% FPR). In the former, we analyze 91,949 packages collected from PyPI over 37 days, achieving a daily detection rate of 2.48 malicious packages with only 2.18 false positives. In the latter, we analyze 1,596 packages adopted by a multinational software company, obtaining only 1.24 false positives daily. These results show that our detector can be seamlessly integrated into both public repositories like PyPI and enterprise ecosystems, ensuring a very low time budget of a few minutes to review the false positives. Overall, we uncovered 346 malicious packages, now reported to the community.
SEAug 11, 2020Code
Code-based Vulnerability Detection in Node.js Applications: How far are we?Bodin Chinthanet, Serena Elisa Ponta, Henrik Plate et al.
With one of the largest available collection of reusable packages, the JavaScript runtime environment Node.js is one of the most popular programming application. With recent work showing evidence that known vulnerabilities are prevalent in both open source and industrial software, we propose and implement a viable code-based vulnerability detection tool for Node.js applications. Our case study lists the challenges encountered while implementing our Node.js vulnerable code detector.
SEAug 29, 2018Code
Vulnerable Open Source Dependencies: Counting Those That MatterIvan Pashchenko, Henrik Plate, Serena Elisa Ponta et al.
BACKGROUND: Vulnerable dependencies are a known problem in today's open-source software ecosystems because OSS libraries are highly interconnected and developers do not always update their dependencies. AIMS: In this paper we aim to present a precise methodology, that combines the code-based analysis of patches with information on build, test, update dates, and group extracted from the very code repository, and therefore, caters to the needs of industrial practice for correct allocation of development and audit resources. METHOD: To understand the industrial impact of the proposed methodology, we considered the 200 most popular OSS Java libraries used by SAP in its own software. Our analysis included 10905 distinct GAVs (group, artifact, version) when considering all the library versions. RESULTS: We found that about 20% of the dependencies affected by a known vulnerability are not deployed, and therefore, they do not represent a danger to the analyzed library because they cannot be exploited in practice. Developers of the analyzed libraries are able to fix (and actually responsible for) 82% of the deployed vulnerable dependencies. The vast majority (81%) of vulnerable dependencies may be fixed by simply updating to a new version, while 1% of the vulnerable dependencies in our sample are halted, and therefore, potentially require a costly mitigation strategy. CONCLUSIONS: Our case study shows that the correct counting allows software development companies to receive actionable information about their library dependencies, and therefore, correctly allocate costly development and audit resources, which is spent inefficiently in case of distorted measurements.
CRApr 20, 2015Code
Impact assessment for vulnerabilities in open-source software librariesHenrik Plate, Serena Elisa Ponta, Antonino Sabetta
Software applications integrate more and more open-source software (OSS) to benefit from code reuse. As a drawback, each vulnerability discovered in bundled OSS potentially affects the application. Upon the disclosure of every new vulnerability, the application vendor has to decide whether it is exploitable in his particular usage context, hence, whether users require an urgent application patch containing a non-vulnerable version of the OSS. Current decision making is mostly based on high-level vulnerability descriptions and expert knowledge, thus, effort intense and error prone. This paper proposes a pragmatic approach to facilitate the impact assessment, describes a proof-of-concept for Java, and examines one example vulnerability as case study. The approach is independent from specific kinds of vulnerabilities or programming languages and can deliver immediate results.
CRJun 28, 2012Code
Detection of Configuration Vulnerabilities in Distributed (Web) EnvironmentsMatteo Maria Casalino, Michele Mangili, Henrik Plate et al.
Many tools and libraries are readily available to build and operate distributed Web applications. While the setup of operational environments is comparatively easy, practice shows that their continuous secure operation is more difficult to achieve, many times resulting in vulnerable systems exposed to the Internet. Authenticated vulnerability scanners and validation tools represent a means to detect security vulnerabilities caused by missing patches or misconfiguration, but current approaches center much around the concepts of hosts and operating systems. This paper presents a language and an approach for the declarative specification and execution of machine-readable security checks for sets of more fine-granular system components depending on each other in a distributed environment. Such a language, building on existing standards, fosters the creation and sharing of security content among security stakeholders. Our approach is exemplified by vulnerabilities of and corresponding checks for Open Source Software commonly used in today's Internet applications.
SEAug 11, 2021
The Used, the Bloated, and the Vulnerable: Reducing the Attack Surface of an Industrial ApplicationSerena Elisa Ponta, Wolfram Fischer, Henrik Plate et al.
Software reuse may result in software bloat when significant portions of application dependencies are effectively unused. Several tools exist to remove unused (byte)code from an application or its dependencies, thus producing smaller artifacts and, potentially, reducing the overall attack surface. In this paper we evaluate the ability of three debloating tools to distinguish which dependency classes are necessary for an application to function correctly from those that could be safely removed. To do so, we conduct a case study on a real-world commercial Java application. Our study shows that the tools we used were able to correctly identify a considerable amount of redundant code, which could be removed without altering the results of the existing application tests. One of the redundant classes turned out to be (formerly) vulnerable, confirming that this technique has the potential to be applied for hardening purposes. However, by manually reviewing the results of our experiments, we observed that none of the tools can handle a widely used default mechanism for dynamic class loading.
SEJul 27, 2015
Modularity for Security-Sensitive WorkflowsDaniel Ricardo dos Santos, Silvio Ranise, Serena Elisa Ponta
An established trend in software engineering insists on using components (sometimes also called services or packages) to encapsulate a set of related functionalities or data. By defining interfaces specifying what functionalities they provide or use, components can be combined with others to form more complex components. In this way, IT systems can be designed by mostly re-using existing components and developing new ones to provide new functionalities. In this paper, we introduce a notion of component and a combination mechanism for an important class of software artifacts, called security-sensitive workflows. These are business processes in which execution constraints on the tasks are complemented with authorization constraints (e.g., Separation of Duty) and authorization policies (constraining which users can execute which tasks). We show how well-known workflow execution patterns can be simulated by our combination mechanism and how authorization constraints can also be imposed across components. Then, we demonstrate the usefulness of our notion of component by showing (i) the scalability of a technique for the synthesis of run-time monitors for security-sensitive workflows and (ii) the design of a plug-in for the re-use of workflows and related run-time monitors inside an editor for security-sensitive workflows.