Yafei Wu

3papers

56citations

Novelty50%

AI Score24

Ranked #177,077 of 201,326 authors (top 88%)#5,441 in CR (top 75%)

3 Papers

CRJan 30, 2022

DeepCatra: Learning Flow- and Graph-based Behaviors for Android Malware Detection

Yafei Wu, Jian Shi, Peicheng Wang et al.

As Android malware is growing and evolving, deep learning has been introduced into malware detection, resulting in great effectiveness. Recent work is considering hybrid models and multi-view learning. However, they use only simple features, limiting the accuracy of these approaches in practice. In this paper, we propose DeepCatra, a multi-view learning approach for Android malware detection, whose model consists of a bidirectional LSTM (BiLSTM) and a graph neural network (GNN) as subnets. The two subnets rely on features extracted from statically computed call traces leading to critical APIs derived from public vulnerabilities. For each Android app, DeepCatra first constructs its call graph and computes call traces reaching critical APIs. Then, temporal opcode features used by the BiLSTM subnet are extracted from the call traces, while flow graph features used by the GNN subnet are constructed from all the call traces and inter-component communications. We evaluate the effectiveness of DeepCatra by comparing it with several state-of-the-art detection approaches. Experimental results on over 18,000 real-world apps and prevalent malware show that DeepCatra achieves considerable improvement, e.g., 2.7% to 14.6% on F1-measure, which demonstrates the feasibility of DeepCatra in practice.

CRDec 13, 2021

$μ$Dep: Mutation-based Dependency Generation for Precise Taint Analysis on Android Native Code

Cong Sun, Yuwan Ma, Dongrui Zeng et al.

The existence of native code in Android apps plays an important role in triggering inconspicuous propagation of secrets and circumventing malware detection. However, the state-of-the-art information-flow analysis tools for Android apps all have limited capabilities of analyzing native code. Due to the complexity of binary-level static analysis, most static analyzers choose to build conservative models for a selected portion of native code. Though the recent inter-language analysis improves the capability of tracking information flow in native code, it is still far from attaining similar effectiveness of the state-of-the-art information-flow analyzers that focus on non-native Java methods. To overcome the above constraints, we propose a new analysis framework, $μ$Dep, to detect sensitive information flows of the Android apps containing native code. In this framework, we combine a control-flow based static binary analysis with a mutation-based dynamic analysis to model the tainting behaviors of native code in the apps. Based on the result of the analyses, $μ$Dep conducts a stub generation for the related native functions to facilitate the state-of-the-art analyzer DroidSafe with fine-grained tainting behavior summaries of native code. The experimental results show that our framework is competitive on the accuracy, and effective in analyzing the information flows in real-world apps and malware compared with the state-of-the-art inter-language static analysis.

CRDec 12, 2021

CryptoEval: Evaluating the Risk of Cryptographic Misuses in Android Apps with Data-Flow Analysis

Cong Sun, Xinpeng Xu, Yafei Wu et al.

The misunderstanding and incorrect configurations of cryptographic primitives have exposed severe security vulnerabilities to attackers. Due to the pervasiveness and diversity of cryptographic misuses, a comprehensive and accurate understanding of how cryptographic misuses can undermine the security of an Android app is critical to the subsequent mitigation strategies but also challenging. Although various approaches have been proposed to detect cryptographic misuse in Android apps, studies have yet to focus on estimating the security risks of cryptographic misuse. To address this problem, we present an extensible framework for deciding the threat level of cryptographic misuse in Android apps. Firstly, we propose a general and unified specification for representing cryptographic misuses to make our framework extensible and develop adapters to unify the detection results of the state-of-the-art cryptographic misuse detectors, resulting in an adapter-based detection tool chain for a more comprehensive list of cryptographic misuses. Secondly, we employ a misuse-originating data-flow analysis to connect each cryptographic misuse to a set of data-flow sinks in an app, based on which we propose a quantitative data-flow-driven metric for assessing the overall risk of the app introduced by cryptographic misuses. To make the per-app assessment more useful for app vetting at the app-store level, we apply unsupervised learning to predict and classify the top risky threats to guide more efficient subsequent mitigation. In the experiments on an instantiated implementation of the framework, we evaluate the accuracy of our detection and the effect of data-flow-driven risk assessment of our framework. Our empirical study on over 40,000 apps and the analysis of popular apps reveal important security observations on the real threats of cryptographic misuse in Android apps.