Rocky K. C. Chang

CR
h-index13
5papers
69citations
Novelty65%
AI Score38

5 Papers

CRJul 5, 2025
Rethinking and Exploring String-Based Malware Family Classification in the Era of LLMs and RAG

Yufan Chen, Daoyuan Wu, Juantao Zhong et al.

Malware family classification aims to identify the specific family (e.g., GuLoader or BitRAT) a malware sample may belong to, in contrast to malware detection or sample classification, which only predicts a Yes/No outcome. Accurate family identification can greatly facilitate automated sample labeling and understanding on crowdsourced malware analysis platforms such as VirusTotal and MalwareBazaar, which generate vast amounts of data daily. In this paper, we explore and assess the feasibility of using traditional binary string features for family classification in the new era of large language models (LLMs) and Retrieval-Augmented Generation (RAG). Specifically, we investigate howFamily-Specific String (FSS) features can be utilized in a manner similar to RAG to facilitate family classification. To this end, we develop a curated evaluation framework covering 4,347 samples from 67 malware families, extract and analyze over 25 million strings, and conduct detailed ablation studies to assess the impact of different design choices in four major modules, with each providing a relative improvement ranging from 8.1% to 120%.

CRMay 23, 2020
When Program Analysis Meets Bytecode Search: Targeted and Efficient Inter-procedural Analysis of Modern Android Apps in BackDroid

Daoyuan Wu, Debin Gao, Robert H. Deng et al.

Widely-used Android static program analysis tools, e.g., Amandroid and FlowDroid, perform the whole-app inter-procedural analysis that is comprehensive but fundamentally difficult to handle modern (large) apps. The average app size has increased three to four times over five years. In this paper, we explore a new paradigm of targeted inter-procedural analysis that can skip irrelevant code and focus only on the flows of security-sensitive sink APIs. To this end, we propose a technique called on-the-fly bytecode search, which searches the disassembled app bytecode text just in time when a caller needs to be located. In this way, it guides targeted (and backward) inter-procedural analysis step by step until reaching entry points, without relying on a whole-app graph. Such search-based inter-procedural analysis, however, is challenging due to Java polymorphism, callbacks, asynchronous flows, static initializers, and inter-component communication in Android apps. We overcome these unique obstacles in our context by proposing a set of bytecode search mechanisms that utilize flexible searches and forward object taint analysis. Atop of this new inter-procedural analysis, we further adjust the traditional backward slicing and forward constant propagation to provide the complete dataflow tracking of sink API calls. We have implemented a prototype called BackDroid and compared it with Amandroid in analyzing 3,178 modern popular apps for crypto and SSL misconfigurations. The evaluation shows that for such sink-based problems, BackDroid is 37 times faster (2.13 v.s. 78.15 minutes) and has no timed-out failure (v.s. 35% in Amandroid), while maintaining close or even better detection effectiveness.

CROct 31, 2015
Cross-Platform Analysis of Indirect File Leaks in Android and iOS Applications

Daoyuan Wu, Rocky K. C. Chang

Today, much of our sensitive information is stored inside mobile applications (apps), such as the browsing histories and chatting logs. To safeguard these privacy files, modern mobile systems, notably Android and iOS, use sandboxes to isolate apps' file zones from one another. However, we show in this paper that these private files can still be leaked by indirectly exploiting components that are trusted by the victim apps. In particular, we devise new indirect file leak (IFL) attacks that exploit browser interfaces, command interpreters, and embedded app servers to leak data from very popular apps, such as Evernote and QQ. Unlike the previous attacks, we demonstrate that these IFLs can affect both Android and iOS. Moreover, our IFL methods allow an adversary to launch the attacks remotely, without implanting malicious apps in victim's smartphones. We finally compare the impacts of four different types of IFL attacks on Android and iOS, and propose several mitigation methods.

CRMay 24, 2014
A Sink-driven Approach to Detecting Exposed Component Vulnerabilities in Android Apps

Daoyuan Wu, Xiapu Luo, Rocky K. C. Chang

Android apps could expose their components for cooperating with other apps. This convenience, however, makes apps susceptible to the exposed component vulnerability (ECV), in which a dangerous API (commonly known as sink) inside its component can be triggered by other (malicious) apps. In the prior works, detecting these ECVs use a set of sinks pertaining to the ECVs under detection. In this paper, we argue that a more comprehensive and effective approach should start by a systematic selection and classification of vulnerability-specific sinks (VSinks). The set of VSinks is much larger than those used in the previous works. Based on these VSinks, our sink-driven approach can detect different kinds of ECVs in an app in two steps. First, VSinks and their categories are identified through a typical forward reachability analysis. Second, based on each VSink's category, a corresponding detection method is used to identify the ECV via a customized backward dataflow analysis. We also design a semi-auto guided analysis and validation capability for system-only broadcast checking to remove some false positives. We implement our sink-driven approach in a tool called ECVDetector and evaluate it with the top 1K Android apps. Using ECVDetector we successfully identify a total of 49 vulnerable apps across all four ECV categories we have defined. To our knowledge, most of them are previously undisclosed, such as the very popular Go SMS Pro and Clean Master. Moreover, the performance of ECVDetector is high, requiring only 9.257 seconds on average to process each component.

CRApr 17, 2014
Analyzing Android Browser Apps for file:// Vulnerabilities

Daoyuan Wu, Rocky K. C. Chang

Securing browsers in mobile devices is very challenging, because these browser apps usually provide browsing services to other apps in the same device. A malicious app installed in a device can potentially obtain sensitive information through a browser app. In this paper, we identify four types of attacks in Android, collectively known as FileCross, that exploits the vulnerable file:// to obtain users' private files, such as cookies, bookmarks, and browsing histories. We design an automated system to dynamically test 115 browser apps collected from Google Play and find that 64 of them are vulnerable to the attacks. Among them are the popular Firefox, Baidu and Maxthon browsers, and the more application-specific ones, including UC Browser HD for tablet users, Wikipedia Browser, and Kids Safe Browser. A detailed analysis of these browsers further shows that 26 browsers (23%) expose their browsing interfaces unintentionally. In response to our reports, the developers concerned promptly patched their browsers by forbidding file:// access to private file zones, disabling JavaScript execution in file:// URLs, or even blocking external file:// URLs. We employ the same system to validate the ten patches received from the developers and find one still failing to block the vulnerability.