LG Jan 9, 2020
Privacy-Preserving Deep Learning Computation for Geo-Distributed Medical Big-Data Platforms
Joohyung Jeon, Junhui Kim, Joongheon Kim et al.
This paper proposes a distributed deep learning framework for privacy-preserving training on medical data. To avoid leaking patients' data from medical platforms, the hidden layers of the deep learning model are split: the first layer is kept on each local platform, and the remaining layers are kept on a centralized server. Keeping the original patients' data on the local platforms maintains their privacy, while using the server for the subsequent layers improves learning performance because data from every platform contributes to training.
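The layer split described in this abstract can be illustrated with a minimal forward-pass sketch. Everything below is an assumption made for illustration: the class names `Platform` and `Server`, the layer sizes, the random initialisation, and the sample record are hypothetical, and training (sending gradients back to the platform) is left out. It is not the authors' implementation.

```python
import random

def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(values):
    return [max(0.0, v) for v in values]

def init_layer(n_out, n_in, rng):
    """Random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

class Platform:
    """Hypothetical local site: holds raw records and the first layer only."""
    def __init__(self, n_in, n_hidden, seed):
        self.weights, self.bias = init_layer(n_hidden, n_in, random.Random(seed))

    def first_layer(self, record):
        # Only this activation, never the raw record, leaves the platform.
        return relu(linear(self.weights, self.bias, record))

class Server:
    """Hypothetical central server: holds every layer after the first."""
    def __init__(self, n_hidden, n_out, seed):
        self.weights, self.bias = init_layer(n_out, n_hidden, random.Random(seed))

    def remaining_layers(self, activation):
        return linear(self.weights, self.bias, activation)

platform = Platform(n_in=4, n_hidden=3, seed=0)
server = Server(n_hidden=3, n_out=2, seed=1)
record = [0.2, 0.7, 0.1, 0.9]  # made-up patient features
activation = platform.first_layer(record)
logits = server.remaining_layers(activation)
```

In this sketch the server only ever sees `activation`, so several platforms could feed the same server without exchanging raw records.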
CR Nov 5, 2019
Phoenix: Towards Persistently Secure, Recoverable, and NVM Friendly Tree of Counters
Mazen Alwadi, Aziz Mohaisen, Amro Awad
Emerging Non-Volatile Memories (NVMs) bring a unique challenge to the security community, namely persistent security. As NVM-based memories are expected to restore their data after recovery, the security metadata must be recovered as well. However, persisting all affected security metadata on each memory write would significantly degrade performance and exacerbate the write endurance problem. Moreover, recovery time can increase significantly (up to hours for practical memory sizes) when security metadata are not updated strictly. Counter trees are used in state-of-the-art commercial secure processors, e.g., Intel's Software Guard Extensions (SGX). Counter trees pose a unique challenge because the whole tree cannot be recovered from its leaves. Thus, to ensure recoverability, all updates to the tree must be persisted, which can mean tens of additional writes on each write. The state-of-the-art scheme, Anubis, enables recoverability but incurs an additional write per cache eviction, i.e., it reduces lifetime to approximately half. Additionally, Anubis degrades performance significantly in many cases. In this paper, we propose Phoenix, a practical novel scheme that relies on reproducing the cache content from before a crash with minimal overhead. Our evaluation results show that Phoenix reduces persisting security metadata overhead writes from 87% extra writes (for Anubis) to less than write-back compared to an encrypted system without recovery, thus improving the NVM lifetime by 2x. Overall, Phoenix performance is better than the baseline, unlike Anubis, which adds 7.9% (max of 35%) performance overhead.
CR Oct 20, 2019
You Can Run, But You Cannot Hide: Using Elevation Profiles to Breach Location Privacy through Trajectory Prediction
Ülkü Meteriz, Necip Fazıl Yıldıran, Aziz Mohaisen
The extensive use of smartphones and wearable devices has facilitated many useful applications. For example, with Global Positioning System (GPS)-equipped smart and wearable devices, many applications can gather, process, and share rich metadata, such as geolocation, trajectories, elevation, and time. Fitness applications such as Strava and Runkeeper use this information for activity tracking, and have recently witnessed a boom in popularity. Those trackers have their own web platforms, and allow users to share activities on such platforms, or even with other social network platforms. To preserve the privacy of users while allowing sharing, those platforms allow users to disclose partial information, such as the elevation profile for an activity, which supposedly will not leak the location trajectory. In this work we examine the extent to which publicly available elevation profiles can be used to predict the location trajectory of users. To tackle this problem, we devise three threat settings under which the city, borough, or even a route can be predicted. Those threat settings define the amount of information available to the adversary to launch the prediction attacks. Establishing that simple features of elevation profiles, e.g., spectral features, are insufficient, we devise both a natural language processing (NLP)-inspired text-like representation and a computer vision-inspired image-like representation of elevation profiles, and we convert the problem at hand into text and image classification problems. We use both traditional machine learning- and deep learning-based techniques, and achieve a prediction success rate ranging from 59.59% to 95.83%. The findings are alarming, and highlight that sharing information such as an elevation profile may carry significant privacy risks.
IV Oct 2, 2019
W-Net: A CNN-based Architecture for White Blood Cells Image Classification
Changhun Jung, Mohammed Abuhamad, Jumabek Alikhanov et al.
Computer-aided methods for analyzing white blood cells (WBC) have become widely popular due to the complexity of the manual process. Recent works have shown highly accurate segmentation and detection of white blood cells from microscopic blood images. However, classifying the observed cells remains a challenge and is in high demand, as the distribution of the five types reflects the condition of the immune system. This work proposes W-Net, a CNN-based method for WBC classification. We evaluate W-Net on a real-world large-scale dataset, obtained from The Catholic University of Korea, that includes 6,562 real images of the five WBC types. W-Net achieves an average accuracy of 97%.
CR Sep 20, 2019
COPYCAT: Practical Adversarial Attacks on Visualization-Based Malware Detection
Aminollah Khormali, Ahmed Abusnaina, Songqing Chen et al.
Despite many attempts, the state of the art in adversarial machine learning on malware detection systems generally yields unexecutable samples. In this work, we set out to examine the robustness of visualization-based malware detection systems against adversarial examples (AEs) that not only fool the model, but also maintain the executability of the original input. As such, we first investigate the application of existing off-the-shelf adversarial attack approaches to malware detection systems, and find that those approaches do not necessarily maintain the functionality of the original inputs. Therefore, we propose COPYCAT, an approach to generating adversarial examples that is specifically designed for malware detection systems with two main goals: achieving a high misclassification rate and maintaining the executability and functionality of the original input. We designed two main configurations for COPYCAT, namely AE padding and sample injection. While the first configuration results in untargeted misclassification attacks, the sample injection configuration is able to force the model to generate a targeted output, which is highly desirable in the malware attribution setting. We evaluate the performance of COPYCAT through an extensive set of experiments on two malware datasets, and report that we were able to generate adversarial samples that are misclassified at a rate of 98.9% and 96.5% with Windows and IoT binary datasets, respectively, outperforming the misclassification rates in the literature. Most importantly, we report that those AEs were executable, unlike AEs generated by off-the-shelf approaches. Our transferability study demonstrates that the AEs generated by our proposed method generalize to other models.
CR Sep 9, 2019
A Privacy-Preserving Longevity Study of Tor's Hidden Services
Amirali Sanatinia, Jeman Park, Erik-Oliver Blass et al.
Tor and hidden services have emerged as a practical solution to protect user privacy against tracking and censorship. At the same time, little is known about the lifetime and nature of hidden services. Collecting data on and studying Tor hidden services is challenging because of the privacy they are designed to provide. Studying the lifetime of hidden services provides several benefits. For example, it allows investigation of the maliciousness of domains based on their lifetime. Short-lived hidden services are less likely to be legitimate domains, e.g., they may be used by ransomware, than long-lived domains. In this work, we investigate the lifetime of hidden services by collecting data from a small (2%) subset of all Tor HSDir relays in a privacy-preserving manner. Based on the data collected, we devise protocols and extrapolation techniques to infer the lifetime of hidden services. Moreover, we show that, due to Tor's specifics, our small subset of HSDir relays is sufficient to extrapolate lifetime with high accuracy, while respecting Tor user and service privacy and following Tor's research safety guidelines. Our results indicate that a large majority of the hidden services have a very short lifetime. In particular, 50% of all current Tor hidden services have an estimated lifetime of only 10 days or less, and 80% have a lifetime of less than a month.
CR Apr 6, 2019
Exploring the Attack Surface of Blockchain: A Systematic Overview
Muhammad Saad, Jeffrey Spaulding, Laurent Njilla et al.
In this paper, we systematically explore the attack surface of Blockchain technology, with an emphasis on public Blockchains. Towards this goal, we attribute attack viability in the attack surface to 1) the Blockchain cryptographic constructs, 2) the distributed architecture of the systems using Blockchain, and 3) the Blockchain application context. For each of those contributing factors, we outline several attacks, including selfish mining, the 51% attack, Domain Name System (DNS) attacks, distributed denial-of-service (DDoS) attacks, consensus delay (due to selfish behavior or DDoS attacks), Blockchain forks, orphaned and stale blocks, block ingestion, wallet thefts, smart contract attacks, and privacy attacks. We also explore the causal relationships between these attacks to demonstrate how various attack vectors are connected to one another. A secondary contribution of this work is outlining effective defense measures taken by Blockchain technology or proposed by researchers to mitigate the effects of these attacks and patch associated vulnerabilities.
CR Mar 2, 2019
Detecting and Classifying Android Malware using Static Analysis along with Creator Information
Hyunjae Kang, Jae-wook Jang, Aziz Mohaisen et al.
Thousands of malicious applications targeting mobile devices, including the popular Android platform, are created every day. A large number of those applications are created by a small number of professional underground actors; however, previous studies overlooked such information as a feature in detecting and classifying malware, and in attributing malware to creators. Guided by this insight, we propose a method to improve the performance of Android malware detection by incorporating the creator's information as a feature, and to classify malicious applications into similar groups. We developed a system that implements this method in practice. Our system enables fast detection of malware by using creator information such as the serial number of the certificate. Additionally, it analyzes malicious behaviors and permissions to increase detection accuracy. The system can also classify malware based on similarity scoring. Finally, we show detection and classification performance of 98% and 90% accuracy, respectively.
CR Feb 12, 2019
Examining Adversarial Learning against Graph-based IoT Malware Detection Systems
Ahmed Abusnaina, Aminollah Khormali, Hisham Alasmary et al.
The main goal of this study is to investigate the robustness of graph-based Deep Learning (DL) models used for Internet of Things (IoT) malware classification against Adversarial Learning (AL). We designed two approaches to craft adversarial IoT software, including Off-the-Shelf Adversarial Attack (OSAA) methods, using six different AL attack approaches, and Graph Embedding and Augmentation (GEA). The GEA approach aims to preserve the functionality and practicality of the generated adversarial sample through a careful embedding of a benign sample to a malicious one. Our evaluations demonstrate that OSAAs are able to achieve a misclassification rate (MR) of 100%. Moreover, we observed that the GEA approach is able to misclassify all IoT malware samples as benign.
CR Feb 11, 2019
Analyzing, Comparing, and Detecting Emerging Malware: A Graph-based Approach
Hisham Alasmary, Aminollah Khormali, Afsah Anwar et al.
The growth in the number of Android and Internet of Things (IoT) devices has been accompanied by a parallel increase in malicious software (malware), calling for new analysis approaches. We represent binaries using the graph properties of their Control Flow Graph (CFG) structure and conduct an in-depth analysis of malicious graphs extracted from Android and IoT malware to understand their differences. Using 2,874 and 2,891 malware binaries corresponding to IoT and Android samples, respectively, we analyze both general characteristics and graph algorithmic properties. Using the CFG as an abstract structure, we then highlight various interesting findings, such as the prevalence of unreachable code in Android malware, indicated by the multiple components in their CFGs, and the larger number of nodes in Android malware compared to IoT malware, indicating a higher order of complexity. We implement machine learning-based classifiers to distinguish IoT malware from benign samples, and achieve an accuracy of 97.9% using Random Forests (RF).
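The link this abstract draws between multiple CFG components and unreachable code can be sketched as a component count over a graph. The function name `count_components`, the node labels, and the toy graph below are hypothetical; edges are treated as undirected (weak connectivity), which is a simplifying assumption rather than the authors' stated procedure.

```python
def count_components(nodes, edges):
    """Count weakly connected components using an iterative depth-first search."""
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        # Ignore edge direction: we only ask whether blocks are linked at all.
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(adjacency[node] - seen)
    return components

# Toy CFG: three linked basic blocks plus two blocks no path reaches from entry.
nodes = ["entry", "check", "exit", "dead_a", "dead_b"]
edges = [("entry", "check"), ("check", "exit"), ("dead_a", "dead_b")]
n_components = count_components(nodes, edges)
```

A count above one means some blocks are disconnected from the entry block, which is the signal the abstract associates with unreachable code.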
NI Feb 10, 2019
Exploring Spatial, Temporal, and Logical Attacks on the Bitcoin Network
Muhammad Saad, Victor Cook, Lan Nguyen et al.
In this paper, we explore partitioning attacks on the Bitcoin network, which is shown to exhibit spatial bias, and temporal and logical diversity. Through a data-driven study we highlight: 1) the centralization of Bitcoin nodes across autonomous systems, indicating the possibility of BGP attacks, 2) the non-uniform consensus among nodes, which can be exploited to partition the network, and 3) the diversity in Bitcoin software usage, which can lead to privacy attacks. Building on prior work, which focused on spatial partitioning, our work extends the analysis of the Bitcoin network to understand the temporal and logical effects on its robustness.
CR Jan 4, 2019
Network-based Analysis and Classification of Malware using Behavioral Artifacts Ordering
Aziz Mohaisen, Omar Alrawi, Jeman Park et al.
Using runtime execution artifacts to identify malware and its associated family is an established technique in the security domain. Many papers in the literature rely on explicit features derived from network, file system, or registry interaction. While effective, the use of these fine-granularity data points makes these techniques computationally expensive. Moreover, the signatures and heuristics are often circumvented by subsequent malware authors. In this work, we propose Chatter, a system that is concerned only with the order in which high-level system events take place. Individual events are mapped onto an alphabet and execution traces are captured via terse concatenations of those letters. Then, leveraging an analyst labeled corpus of malware, n-gram document classification techniques are applied to produce a classifier predicting malware family. This paper describes that technique and its proof-of-concept evaluation. In its prototype form, only network events are considered and eleven malware families are used. We show the technique achieves 83%-94% accuracy in isolation and makes non-trivial performance improvements when integrated with a baseline classifier of combined order features to reach an accuracy of up to 98.8%.
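The event-ordering idea in this abstract (events mapped onto an alphabet, traces as concatenated letters, n-gram features) can be sketched as follows. The event names, the letter mapping in `ALPHABET`, and the sample trace are hypothetical, since the abstract does not list them; the sketch stops at feature extraction and does not train a classifier.

```python
from collections import Counter

# Hypothetical event-to-letter alphabet; the abstract does not give the real mapping.
ALPHABET = {
    "dns_query": "D",
    "tcp_connect": "T",
    "http_get": "G",
    "http_post": "P",
}

def encode_trace(events):
    """Concatenate one letter per high-level event, preserving their order."""
    return "".join(ALPHABET[event] for event in events)

def ngram_counts(trace, n=2):
    """Count overlapping character n-grams in an encoded trace."""
    return Counter(trace[i:i + n] for i in range(len(trace) - n + 1))

events = ["dns_query", "tcp_connect", "http_get",
          "dns_query", "tcp_connect", "http_post"]
trace = encode_trace(events)            # "DTGDTP"
features = ngram_counts(trace, n=2)     # DT appears twice; TG, GD, TP once each
```

The resulting counts could serve as the document-style feature vector for any standard text classifier, which is where an analyst-labeled corpus would come in.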
CR Nov 25, 2018
Towards Blockchain-Driven, Secure and Transparent Audit Logs
Ashar Ahmad, Muhammad Saad, Mostafa Bassiouni et al.
Audit logs serve as a critical component in enterprise business systems, where they are used for auditing, storing, and tracking changes made to the data. However, audit logs are vulnerable to a series of attacks that enable adversaries to tamper with data and the corresponding audit logs. In this paper, we present BlockAudit: a scalable and tamper-proof system that leverages the design properties of audit logs and the security guarantees of blockchains to enable secure and trustworthy audit logs. Towards that, we construct the design schema of BlockAudit, and outline its operational procedures. We implement our design on Hyperledger and evaluate its performance in terms of latency, network size, and payload size. Our results show that conventional audit logs can seamlessly transition into BlockAudit to achieve higher security, integrity, and fault tolerance.
CR Nov 25, 2018
Countering Selfish Mining in Blockchains
Muhammad Saad, Laurent Njilla, Charles Kamhoua et al.
Selfish mining is a well-known vulnerability in blockchains exploited by miners to steal block rewards. In this paper, we explore a new form of selfish mining attack that guarantees high rewards at low cost. We show the feasibility of this attack, which is facilitated by recent developments in blockchain technology that open new attack avenues. By outlining the limitations of existing countermeasures, we highlight a need for new defense strategies to counter this attack, and leverage key system parameters in blockchain applications to propose an algorithm that enforces fair mining. We use the expected transaction confirmation height and block publishing height to detect selfish mining behavior and develop a network-wide defense mechanism to disincentivize selfish miners. Our design involves a simple modification to the transaction data structure in order to obtain a "truth state" used to catch the selfish miners and prevent honest miners from losing block rewards.
CR Sep 6, 2018
End-to-End Analysis of In-Browser Cryptojacking
Muhammad Saad, Aminollah Khormali, Aziz Mohaisen
In-browser cryptojacking involves hijacking the CPU power of a website's visitor to perform CPU-intensive cryptocurrency mining, and has been on the rise, with 8500% growth during 2017. While some websites advocate cryptojacking as a replacement for online advertisement, web attackers exploit it to generate revenue by embedding malicious cryptojacking code in highly ranked websites. Motivated by the rise of cryptojacking and the lack of any prior systematic work, we set out to analyze malicious cryptojacking statically and dynamically, and examine the economic basis of cryptojacking as an alternative to advertisement. For our static analysis, we perform content-, currency-, and code-based analyses. Through the content-based analysis, we show that cryptojacking is a widespread threat targeting a variety of website types. Through the currency-based analysis, we highlight affinities between mining platforms and currencies: the majority of cryptojacking websites use Coinhive to mine Monero. Through the code-based analysis, we highlight unique code complexity features of cryptojacking scripts, and use them to detect cryptojacking code among benign and other malicious JavaScript code, with an accuracy of 96.4%. Through dynamic analysis, we highlight the impact of cryptojacking on system resources, such as CPU and battery consumption (in battery-powered devices); we use the latter to build an analytical model that examines the feasibility of cryptojacking as an alternative to online advertisement, and show a huge negative profit/loss gap, suggesting that the model is impractical. By surveying existing countermeasures and their limitations, we conclude with long-term countermeasures using insights from our analysis.
CR Feb 2, 2017
Beyond Free Riding: Quality of Indicators for Assessing Participation in Information Sharing for Threat Intelligence
Omar Al-Ibrahim, Aziz Mohaisen, Charles Kamhoua et al.
Threat intelligence sharing has become a growing concept, whereby entities exchange patterns of threats, in the form of indicators, within a community of trust for threat analysis and incident response. However, sharing threat-related information poses various risks to an organization's security, privacy, and competitiveness. Given the coinciding benefits and risks of threat information sharing, some entities have adopted an elusive behavior of "free-riding" so that they can acquire the benefits of sharing without contributing much to the community. So far, the effectiveness of sharing has been viewed from the perspective of the amount of information exchanged as opposed to its quality. In this paper, we introduce the notion of quality of indicators (QoI) for assessing the level of contribution by participants in information sharing for threat intelligence. We exemplify this notion through various metrics, including correctness, relevance, utility, and uniqueness of indicators. To realize the notion of QoI, we conducted an empirical study and took a benchmark approach to define quality metrics; we then obtained a reference dataset and used tools from the machine learning literature for quality assessment. We compared these results against a model that only considers the volume of information as a metric for contribution, and unveiled various interesting observations, including the ability to spot low-quality contributions that are synonymous with free riding in threat information sharing.
CR Feb 2, 2017
Rethinking Information Sharing for Actionable Threat Intelligence
Aziz Mohaisen, Omar Al-Ibrahim, Charles Kamhoua et al.
In the past decade, the information security and threat landscape has grown significantly, making it difficult for a single defender to defend against all attacks at the same time. This called for introducing information sharing, a paradigm in which threat indicators are shared in a community of trust to facilitate defenses. Standards for representation, exchange, and consumption of indicators are proposed in the literature, although various issues are undermined. In this paper, we rethink information sharing for actionable intelligence, by highlighting various issues that deserve further exploration. We argue that information sharing can benefit from well-defined use models, threat models, well-understood risk by measurement and robust scoring, well-understood and preserved privacy and quality of indicators, and robust mechanisms to avoid free-riding behavior of selfish agents. We call for using the differential nature of data and community structures for optimizing sharing.
CR Jun 22, 2016
Domain Name System Security and Privacy: Old Problems and New Challenges
Ah Reum Kang, Jeffrey Spaulding, Aziz Mohaisen
The domain name system (DNS) is an important protocol in today's Internet operation, and is the standard mechanism for mapping domain names, which are easy for humans to read, understand, and remember, to the IP addresses of Internet resources. The wealth of research activity on DNS in general, and on its security and privacy in particular, might suggest that all problems in this domain are solved. The reality, however, is that despite the large body of literature on various aspects of DNS, there are still many challenges that need to be addressed. In this paper, we review the various activities in the research community on DNS operation, security, and privacy, and outline various challenges and open research directions that need to be tackled.
CR Jun 6, 2016
Mal-Netminer: Malware Classification Approach based on Social Network Analysis of System Call Graph
Jae-wook Jang, Jiyoung Woo, Aziz Mohaisen et al.
As the security landscape evolves over time, where thousands of species of malicious code are seen every day, antivirus vendors strive to detect and classify malware families for efficient and effective responses against malware campaigns. To enrich this effort, and by capitalizing on ideas from the social network analysis domain, we build a tool that can help classify malware families using features derived from the graph structure of their system calls. To achieve that, we first construct a system call graph that consists of system calls found in the execution of the individual malware families. To explore distinguishing features of various malware species, we study social network properties as applied to the call graph, including the degree distribution, degree centrality, average distance, clustering coefficient, network density, and component ratio. We use features derived from those properties to build a classifier for malware families. Our experimental results show that influence-based graph metrics such as the degree centrality are effective for classifying malware, whereas the general structural metrics are less effective. Our experiments demonstrate that the proposed system performs well in detecting and classifying malware families within each malware class, with accuracy greater than 96%.
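Two of the graph properties named in this abstract, degree centrality and network density, can be computed with a short sketch. The system call names and edges below are made up, and the graph is treated as a simple undirected graph with at least two nodes; both are assumptions for illustration, not the construction used by the authors.

```python
def degree_centrality(edges):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    nodes = {node for edge in edges for node in edge}
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    scale = len(nodes) - 1  # assumes at least two nodes
    return {node: d / scale for node, d in degree.items()}

def density(edges):
    """Fraction of possible undirected edges that are present."""
    n = len({node for edge in edges for node in edge})
    return 2 * len(edges) / (n * (n - 1))  # assumes no duplicate edges

# Hypothetical system call graph: an edge links two calls seen consecutively.
call_edges = [
    ("open", "read"),
    ("read", "write"),
    ("read", "close"),
    ("write", "close"),
]
centrality = degree_centrality(call_edges)
graph_density = density(call_edges)
```

Here `read` touches every other call, so it gets the highest centrality; values like these would form one row of a feature vector per malware sample.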
CY Jun 4, 2016
Multimodal Game Bot Detection using User Behavioral Characteristics
Ah Reum Kang, Seong Hoon Jeong, Aziz Mohaisen et al.
As the online service industry has continued to grow, illegal activities in the online world have drastically increased and become more diverse. Most illegal activities occur continuously because cyber assets, such as game items and cyber money in online games, can be monetized into real currency. The aim of this study is to detect game bots in a Massively Multiplayer Online Role Playing Game (MMORPG). We observed the behavioral characteristics of game bots and found that they execute repetitive tasks associated with gold farming and real money trading. We propose a game bot detection methodology based on user behavioral characteristics. The methodology of this paper was applied to real data provided by a major MMORPG company. Detection accuracy rate increased to 96.06% on the banned account list.
CR Jun 4, 2016
Andro-profiler: Detecting and Classifying Android Malware based on Behavioral Profiles
Jae-wook Jang, Jaesung Yun, Aziz Mohaisen et al.
Mass-market mobile security threats have increased recently due to the growth of mobile technologies and the popularity of mobile devices. Accordingly, techniques have been introduced for identifying, classifying, and defending against mobile threats using static, dynamic, on-device, off-device, and hybrid approaches. In this paper, we contribute to the mobile security defense posture by introducing Andro-profiler, a hybrid behavior-based analysis and classification system for mobile malware. Andro-profiler classifies malware by exploiting behavior profiles extracted from integrated system logs, including system calls, which are implicitly equivalent to distinct behavior characteristics. Andro-profiler executes a malicious application on an emulator to generate the integrated system logs, and creates human-readable behavior profiles by analyzing those logs. By comparing the behavior profile of a malicious application with the representative behavior profile of each malware family, Andro-profiler detects it and classifies it into a malware family. The experimental results demonstrate that Andro-profiler is scalable, performs well in detecting and classifying malware with accuracy greater than 98%, outperforms the existing state-of-the-art work, and is capable of identifying zero-day mobile malware samples.
CR Mar 9, 2016
The Landscape of Domain Name Typosquatting: Techniques and Countermeasures
Jeffrey Spaulding, Shambhu Upadhyaya, Aziz Mohaisen
With more than 294 million registered domain names as of late 2015, the domain name ecosystem has evolved to become a cornerstone for the operation of the Internet. Domain names today serve everyone, from individuals for their online presence to big brands for their business operations. Such an ecosystem, which has facilitated legitimate business and personal uses, has also fostered "creative" cases of misuse, including phishing, spam, hit and traffic stealing, and online scams, among others. As a first step towards this misuse, the registration of a legitimate-looking domain is often required. For that, domain typosquatting provides a great avenue for cybercriminals to conduct their crimes. In this paper, we review the landscape of domain name typosquatting, highlighting models and advanced techniques for generating typosquatted domain names, models for their monetization, and the existing literature on countermeasures. We further highlight potentially fruitful directions on technical countermeasures that are lacking in the literature.
CR Dec 22, 2013
The Sybil Attacks and Defenses: A Survey
Aziz Mohaisen, Joongheon Kim
In this paper we take a close look at the Sybil attack and advances in defending against it, with particular emphasis on recent work. We identify three major veins of literature on defending against the attack: using trusted certification, using resource testing, and using social networks. The first vein of literature considers defending against the attack using trusted certification, which is done by either centralized certification or distributed certification using cryptographic primitives that can replace the centralized certification entity. The second vein of literature considers defending against the attack by resource testing, which can be in the form of IP testing, network coordinates, or recurring costs such as requiring clients to solve puzzles. The third and last vein of literature mitigates the attack by combining social networks, used to bootstrap security, with tools from random walk theory that have been shown to be effective in defending against the attack under certain assumptions. Our survey and analyses of the different schemes in the three veins of literature show several shortcomings, which form several interesting directions and research questions worthy of investigation.