CR Jul 24, 2023
How Does Naming Affect LLMs on Code Analysis Tasks?
Zhilong Wang, Lan Zhang, Chen Cao et al.
Large Language Models (LLMs), such as GPT and BERT, were proposed for natural language processing (NLP) and have shown promising results as general-purpose language models. A growing number of industry professionals and researchers are adopting LLMs for program analysis tasks. However, one significant difference between programming languages and natural languages is that a programmer is free to assign any name to the variables, methods, and functions in a program, whereas a natural language writer is not. Intuitively, the quality of naming in a program affects the performance of LLMs on program analysis tasks. This paper investigates how naming affects LLMs on code analysis tasks. Specifically, we create a set of datasets whose code contains nonsense or misleading names for variables, methods, and functions, respectively. We then use well-trained models (CodeBERT) to perform code analysis tasks on these datasets. The experimental results show that naming has a significant impact on the performance of LLM-based code analysis, indicating that code representation learning based on LLMs relies heavily on well-defined names in code. Additionally, we conduct a case study on some special code analysis tasks using GPT, providing further insights.
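To make the "nonsense names" idea concrete, the following is a minimal sketch of how such a dataset variant might be produced for Python code. It is not the authors' tooling: the class `NonsenseRenamer`, the helper `rename_identifiers`, and the placeholder scheme `v0, v1, ...` are all assumptions chosen for illustration, and the sketch ignores scoping subtleties that a real transformation would need to handle.

```python
import ast


class NonsenseRenamer(ast.NodeTransformer):
    """Replace user-defined identifiers with meaningless placeholders (hypothetical helper)."""

    def __init__(self):
        self.mapping = {}  # original name -> placeholder

    def _rename(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        # Rename the function first so recursive calls in the body are renamed too.
        node.name = self._rename(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._rename(node.arg)
        return node

    def visit_Name(self, node):
        # Rename a name when it is assigned, or when it was already defined
        # earlier; names that are only read (e.g. builtins) are left alone.
        if isinstance(node.ctx, ast.Store) or node.id in self.mapping:
            node.id = self._rename(node.id)
        return node


def rename_identifiers(source):
    """Return the source with functions, arguments and variables renamed."""
    tree = NonsenseRenamer().visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


original = """
def average(values):
    total = sum(values)
    return total / len(values)
"""
renamed = rename_identifiers(original)
print(renamed)
```

The renamed program behaves identically, but every name that hinted at its purpose is gone, which is the kind of input the study feeds to the model.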
CR Aug 20, 2024
Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles
Zhilong Wang, Haizhou Wang, Nanqing Luo et al.
Large Language Model (LLM) jailbreak refers to a type of attack that aims to bypass the safeguards of an LLM so that it generates content inconsistent with its safe usage guidelines. Based on insights from the self-attention computation process, this paper proposes a novel black-box jailbreak approach, which crafts the payload prompt by strategically injecting the prohibited query into a carrier article. The carrier article, which stays semantically close to the prohibited query, is automatically produced by combining a hypernymy article and a context, both of which are generated from the prohibited query. The intuition behind using a carrier article is to activate the neurons in the model related to the semantics of the prohibited query while suppressing the neurons that would trigger the objectionable text. The carrier article itself is benign, and we leverage prompt injection techniques to produce the payload prompt. We evaluate our approach using JailbreakBench, testing against four target models across 100 distinct jailbreak objectives. The experimental results demonstrate our method's effectiveness: it achieves an average success rate of 63% across all target models, significantly outperforming existing black-box jailbreak methods.
CR Apr 27
Detecting Avalanche Effect in Adversarial Settings: Spotting the Encryption Loops in Ransomware
Nanqing Luo, Xusheng Li, Haizhou Wang et al.
Spotting encryption loops in binary-only ransomware is a critical reverse engineering task. Because the avalanche effect, an intrinsic characteristic of any secure encryption algorithm, is unavoidable when ransomware encrypts victim data, spotting encryption loops through avalanche effect detection is a very promising direction. Unfortunately, no existing work in this direction ensures that the effect being checked is the avalanche effect itself. Although CipherXRay is inspired by the avalanche effect, it only checks whether a "ripple effect" (i.e., a necessary but not sufficient condition) of the avalanche effect exists, allowing a straightforward counterattack to succeed. In this work, we present a new approach that checks the avalanche effect itself. Because the detection is conducted in adversarial settings (e.g., the ransomware author may obfuscate the code), a viable approach must tolerate inaccurate input and output identification and must be resilient to adversarial evasion. These challenges are addressed by a novel record-and-replay detection mechanism that takes advantage of the statistical guarantees provided by the Shapiro-Wilk normality test. The experimental results show that our approach achieves a 0.0% false negative rate and a 1.1% false positive rate. When our tool is employed to reverse engineer real-world ransomware samples, it succeeds in analyzing all the ransomware samples selected from ten representative families.
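For readers unfamiliar with the avalanche effect, the sketch below measures it directly: flip one input bit and count how many output bits change. It only illustrates the property and does not reproduce the paper's record-and-replay mechanism or its Shapiro-Wilk test. SHA-256 is used as a convenient stand-in from the standard library (it is a hash, not a cipher), and the helper names `flip_bit`, `hamming`, `avalanche_ratio`, and `xor_mask` are assumptions made for this example.

```python
import hashlib


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at position index inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


def hamming(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avalanche_ratio(func, data: bytes, flips: int = 64) -> float:
    """Average fraction of output bits that change per single-bit input flip."""
    base = func(data)
    changed = sum(hamming(base, func(flip_bit(data, i))) for i in range(flips))
    return changed / (flips * len(base) * 8)


def sha256_digest(data: bytes) -> bytes:
    # Stand-in for a function with strong diffusion.
    return hashlib.sha256(data).digest()


def xor_mask(data: bytes) -> bytes:
    # A weak transform: each output bit depends on exactly one input bit.
    return bytes(b ^ 0x5A for b in data)


sample = bytes(range(32))
strong = avalanche_ratio(sha256_digest, sample)
weak = avalanche_ratio(xor_mask, sample)
print(f"SHA-256 ratio: {strong:.3f}, XOR mask ratio: {weak:.4f}")
```

A transform with the avalanche effect changes roughly half of its output bits per flipped input bit, while the XOR mask changes exactly one bit, which is why the property is a useful signal for telling real encryption loops apart from simple byte manipulation.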
LG Mar 12, 2024
Towards Independence Criterion in Machine Unlearning of Features and Labels
Ling Han, Nanqing Luo, Hao Huang et al.
This work delves into the complexities of machine unlearning in the face of distributional shifts, particularly focusing on the challenges posed by non-uniform feature and label removal. With the advent of regulations like the GDPR emphasizing data privacy and the right to be forgotten, machine learning models face the daunting task of unlearning sensitive information without compromising their integrity or performance. Our research introduces a novel approach that leverages influence functions and principles of distributional independence to address these challenges. By proposing a comprehensive framework for machine unlearning, we aim to ensure privacy protection while maintaining model performance and adaptability across varying distributions. Our method not only facilitates efficient data removal but also dynamically adjusts the model to preserve its generalization capabilities. Through extensive experimentation, we demonstrate the efficacy of our approach in scenarios characterized by significant distributional shifts, making substantial contributions to the field of machine unlearning. This research paves the way for developing more resilient and adaptable unlearning techniques, ensuring models remain robust and accurate in the dynamic landscape of data privacy and machine learning.
CR Nov 8, 2024
Unmasking the Shadows: Pinpoint the Implementations of Anti-Dynamic Analysis Techniques in Malware Using LLM
Haizhou Wang, Nanqing Luo, Xusheng Li et al.
Sandboxes and other dynamic analysis processes are now prevalent in malware detection systems because they enhance the capability of detecting 0-day malware. Consequently, techniques of anti-dynamic analysis (TADA) are prevalent in modern malware samples, and sandboxes can suffer from false negatives and analysis failures when analyzing samples that employ TADAs. In such cases, human reverse engineers have to conduct dynamic analysis manually (i.e., debugging, patching), which in turn is also obstructed by TADAs. In this work, we propose a Large Language Model (LLM) based workflow that can pinpoint the location of the TADA implementation in the code, helping reverse engineers place the breakpoints used in debugging. Our evaluation shows that we successfully identify the locations of 87.80% of the known TADA implementations adopted from public repositories. In addition, we successfully pinpoint the locations of TADAs in 4 well-known malware samples that are documented in online malware analysis blogs.
LG Mar 2, 2021
DPlis: Boosting Utility of Differentially Private Deep Learning via Randomized Smoothing
Wenxiao Wang, Tianhao Wang, Lun Wang et al.
Deep learning techniques have achieved remarkable performance in wide-ranging tasks. However, when models are trained on privacy-sensitive datasets, the model parameters may expose private information in the training data. Prior attempts at differentially private training, although offering rigorous privacy guarantees, lead to much lower model performance than non-private training. Moreover, different runs of the same training algorithm produce models with large performance variance. To address these issues, we propose DPlis (Differentially Private Learning wIth Smoothing). The core idea of DPlis is to construct a smooth loss function that favors noise-resilient models lying in large flat regions of the loss landscape. We provide theoretical justification for the utility improvements of DPlis. Extensive experiments also demonstrate that DPlis can effectively boost model quality and training stability under a given privacy budget.
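The intuition that a smoothed loss favors flat regions can be illustrated with a toy one-dimensional example. The sketch below is not the DPlis algorithm (it contains no gradient clipping, no privacy noise, and no training loop); it only estimates a Gaussian-smoothed loss by Monte Carlo averaging. The toy `loss`, the function `smoothed_loss`, and the values of `sigma`, `samples`, and `seed` are all assumptions made for illustration.

```python
import random


def loss(w: float) -> float:
    """Toy loss with a sharp minimum at w = -1 and a flat minimum at w = 1."""
    sharp = 100.0 * (w + 1.0) ** 2       # narrow valley, lowest value 0.0
    flat = (w - 1.0) ** 2 + 0.1          # wide valley, lowest value 0.1
    return min(sharp, flat)


def smoothed_loss(w: float, sigma: float = 0.3, samples: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected loss under Gaussian weight noise."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    total = sum(loss(w + rng.gauss(0.0, sigma)) for _ in range(samples))
    return total / samples


for w in (-1.0, 1.0):
    print(f"w={w:+.1f}  raw={loss(w):.3f}  smoothed={smoothed_loss(w):.3f}")
```

The raw loss prefers the sharp minimum, but after smoothing the flat minimum scores lower, because small perturbations of the weights barely change the loss there. That is the sense in which a smooth loss rewards models that tolerate the noise added during differentially private training.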