Omar Alrawi

3papers

116citations

Novelty38%

AI Score21

Ranked #190,439 of 205,806 authors (top 93%)#6,162 in CR (top 84%)

3 Papers

CRFeb 4, 2020

Towards Measuring Supply Chain Attacks on Package Managers for Interpreted Languages

Ruian Duan, Omar Alrawi, Ranjita Pai Kasturi et al.

Package managers have become a vital part of the modern software development process. They allow developers to reuse third-party code, share their own code, minimize their codebase, and simplify the build process. However, recent reports showed that package managers have been abused by attackers to distribute malware, posing significant security risks to developers and end-users. For example, eslint-scope, a package with millions of weekly downloads in Npm, was compromised to steal credentials from developers. To understand the security gaps and the misplaced trust that make recent supply chain attacks possible, we propose a comparative framework to qualitatively assess the functional and security features of package managers for interpreted languages. Based on qualitative assessment, we apply well-known program analysis techniques such as metadata, static, and dynamic analysis to study registry abuse. Our initial efforts found 339 new malicious packages that we reported to the registries for removal. The package manager maintainers confirmed 278 (82%) from the 339 reported packages where three of them had more than 100,000 downloads. For these packages we were issued official CVE numbers to help expedite the removal of these packages from infected victims. We outline the challenges of tailoring program analysis tools to interpreted languages and release our pipeline as a reference point for the community to build on and help in securing the software supply chain.

CRJan 4, 2019

Network-based Analysis and Classification of Malware using Behavioral Artifacts Ordering

Aziz Mohaisen, Omar Alrawi, Jeman Park et al.

Using runtime execution artifacts to identify malware and its associated family is an established technique in the security domain. Many papers in the literature rely on explicit features derived from network, file system, or registry interaction. While effective, the use of these fine-granularity data points makes these techniques computationally expensive. Moreover, the signatures and heuristics are often circumvented by subsequent malware authors. In this work, we propose Chatter, a system that is concerned only with the order in which high-level system events take place. Individual events are mapped onto an alphabet and execution traces are captured via terse concatenations of those letters. Then, leveraging an analyst labeled corpus of malware, n-gram document classification techniques are applied to produce a classifier predicting malware family. This paper describes that technique and its proof-of-concept evaluation. In its prototype form, only network events are considered and eleven malware families are used. We show the technique achieves 83%-94% accuracy in isolation and makes non-trivial performance improvements when integrated with a baseline classifier of combined order features to reach an accuracy of up to 98.8%.

CRMar 28, 2013

Unveiling Zeus

Abedelaziz Mohaisen, Omar Alrawi

Malware family classification is an age old problem that many Anti-Virus (AV) companies have tackled. There are two common techniques used for classification, signature based and behavior based. Signature based classification uses a common sequence of bytes that appears in the binary code to identify and detect a family of malware. Behavior based classification uses artifacts created by malware during execution for identification. In this paper we report on a unique dataset we obtained from our operations and classified using several machine learning techniques using the behavior-based approach. Our main class of malware we are interested in classifying is the popular Zeus malware. For its classification we identify 65 features that are unique and robust for identifying malware families. We show that artifacts like file system, registry, and network features can be used to identify distinct malware families with high accuracy---in some cases as high as 95%.