Ruturaj K. Vaidya

SE, Mar 6, 2020

SpellBound: Defending Against Package Typosquatting

Matthew Taylor, Ruturaj K. Vaidya, Drew Davidson et al.

Package managers for software repositories based on a single programming language are very common. Examples include npm (JavaScript) and PyPI (Python). These tools encourage code reuse, making it trivial for developers to import external packages. Unfortunately, the repositories' size and the ease with which packages can be published facilitate the practice of typosquatting: the uploading of a package with a name similar to that of a highly popular package, typically with the aim of capturing some of the popular package's installs. Typosquatting has serious negative implications, resulting in developers importing malicious packages or, as we show, code clones which do not incorporate recent security updates. To tackle this problem, we present SpellBound, a tool for identifying and reporting potentially erroneous imports to developers. SpellBound implements a novel typosquatting detection technique, based on an in-depth analysis of npm and PyPI. Our technique leverages a model of lexical similarity between names, and further incorporates the notion of package popularity. This approach flags cases where unknown/scarcely used packages would be installed in place of popular ones with similar names, before installation occurs. We evaluated SpellBound on both npm and PyPI, with encouraging results: SpellBound flags typosquatting cases while generating limited warnings (0.5% of total package installs) and low overhead (only 2.5% of package install time). Furthermore, SpellBound allowed us to confirm known cases of typosquatting and to discover one high-profile, previously unknown case of typosquatting that resulted in a package takedown by the npm security team.
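The general idea in this abstract, combining name similarity with package popularity, can be illustrated with a minimal sketch. This is not SpellBound's implementation: the abstract does not specify the lexical similarity model, so plain Levenshtein edit distance is swapped in here, and the function names (`edit_distance`, `check_install`), the thresholds (`popular_min`, `max_distance`), and the download counts in `SAMPLE_DOWNLOADS` are all hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def check_install(name: str,
                  downloads: Dict[str, int],
                  popular_min: int = 100_000,
                  max_distance: int = 1) -> Optional[str]:
    """Return the popular package that `name` may be a typo of, or None.

    `popular_min` and `max_distance` are assumed thresholds, not values
    taken from the paper.
    """
    # A package that is itself popular is never flagged.
    if downloads.get(name, 0) >= popular_min:
        return None
    # Popular packages whose names are lexically close to the request.
    candidates = [
        other for other, count in downloads.items()
        if other != name
        and count >= popular_min
        and edit_distance(name, other) <= max_distance
    ]
    if not candidates:
        return None
    # Suggest the most popular close match.
    return max(candidates, key=lambda pkg: downloads[pkg])


# Hypothetical download counts, invented for this example.
SAMPLE_DOWNLOADS = {"requests": 5_000_000, "reqests": 40, "numpy": 4_000_000}

if __name__ == "__main__":
    for requested in ("reqests", "requests", "numpy"):
        suggestion = check_install(requested, SAMPLE_DOWNLOADS)
        if suggestion is not None:
            print(f"warning: '{requested}' is rarely used; "
                  f"did you mean '{suggestion}'?")
```

In this sketch the check runs before installation, mirroring the abstract's point that the warning should appear before the unknown package is installed. A real tool would also need a policy for what counts as "popular" and a richer similarity model than a single edit-distance threshold.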

CR, Mar 6, 2019

Security Issues in Language-based Software Ecosystems

Ruturaj K. Vaidya, Lorenzo De Carli, Drew Davidson et al.

Language-based ecosystems (LBEs), i.e., software ecosystems based on a single programming language, are very common. Examples include the npm ecosystem for JavaScript and PyPI for Python. These environments encourage code reuse between packages, and incorporate utilities (package managers) for automatically resolving dependencies. However, the same aspects that make these systems popular, namely the ease of publishing code and importing external code, also create novel security issues, which have so far seen little study. We present a systematic study of security issues that plague LBEs. These issues are inherent to the ways these ecosystems work and cannot be resolved by fixing software vulnerabilities in either the packages or the utilities, e.g., package manager tools, that build these ecosystems. We systematically characterize recent security attacks from various aspects, including attack strategies, vectors, and goals. Our characterization and in-depth analysis of the npm and PyPI ecosystems, which represent the largest LBEs and cover nearly one million packages, indicate that these ecosystems make an opportune environment for attackers to incorporate stealthy attacks. Overall, we argue that (i) fully automated detection of malicious packages is likely to be infeasible; however, (ii) tools and metrics that help developers assess the risk of including external dependencies would go a long way toward preventing attacks.

2 Papers