Danfeng Yao

5papers

106citations

Novelty44%

AI Score39

Ranked #100,976 of 205,806 authors (top 49%)#2,570 in CR (top 35%)

5 Papers

62.7CRMay 5

Generating Proof-of-Vulnerability Tests to Help Enhance the Security of Complex Software

Shravya Kanchi, Xiaoyan Zang, Ying Zhang et al.

Developers create modern software applications (Apps) on top of third-party libraries (Libs). When library vulnerabilities are reachable through application code, the applications can be vulnerable to software supply chain attacks. Prior work shows that developers often require concrete and executable evidence, i.e., proof-of-vulnerability (PoV) tests, to decide whether a reported dependency vulnerability poses a practical security risk to their application. However, manually crafting such tests is challenging, and existing tool support is insufficient to automate the procedure. To streamline test generation, we created PoVSmith -- a new approach that combines call path analysis, exemplar test, code context, and feedback into multiple prompts to guide a coding agent (i.e., Codex) and a large language model (i.e., GPT) for test generation, execution, and assessment. We evaluated PoVSmith on 33 $\langle$App, Lib$\rangle$ Java program pairs, where each App depends on a vulnerable Lib. PoVSmith revealed 158 unique application-level entry points (i.e., public methods) calling vulnerable library APIs; 152 (96\%) of them were correctly found, together with the call paths properly recognized. With such method call information, PoVSmith generated 152 tests, 84 (55\%) of which demonstrated feasible ways of attacking Apps by exploiting Lib vulnerabilities. PoVSmith substantially outperforms the state-of-the-art LLM-based approach, as it reduces human involvement while dramatically improving test quality. Our work contributes (1) a novel approach of agent-based test generation, (2) an iterative code refinement process driven by execution feedback, and (3) LLM-based quality assessment grounded in both the test context and execution logs.

SEMar 15, 2021

Embedding Code Contexts for Cryptographic API Suggestion:New Methodologies and Comparisons

Ya Xiao, Salman Ahmed, Wenjia Song et al.

Despite recent research efforts, the vision of automatic code generation through API recommendation has not been realized. Accuracy and expressiveness challenges of API recommendation needs to be systematically addressed. We present a new neural network-based approach, Multi-HyLSTM for API recommendation --targeting cryptography-related code. Multi-HyLSTM leverages program analysis to guide the API embedding and recommendation. By analyzing the data dependence paths of API methods, we train embedding and specialize a multi-path neural network architecture for API recommendation tasks that accurately predict the next API method call. We address two previously unreported programming language-specific challenges, differentiating functionally similar APIs and capturing low-frequency long-range influences. Our results confirm the effectiveness of our design choices, including program-analysis-guided embedding, multi-path code suggestion architecture, and low-frequency long-range-enhanced sequence learning, with high accuracy on top-1 recommendations. We achieve a top-1 accuracy of 91.41% compared with 77.44% from the state-of-the-art tool SLANG. In an analysis of 245 test cases, compared with the commercial tool Codota, we achieve a top-1 recommendation accuracy of 88.98%, which is significantly better than Codota's accuracy of 64.90%. We publish our data and code as a large Java cryptographic code dataset.

CRFeb 22, 2019

Exploitation Techniques and Defenses for Data-Oriented Attacks

Long Cheng, Hans Liljestrand, Thomas Nyman et al.

Data-oriented attacks manipulate non-control data to alter a program's benign behavior without violating its control-flow integrity. It has been shown that such attacks can cause significant damage even in the presence of control-flow defense mechanisms. However, these threats have not been adequately addressed. In this SoK paper, we first map data-oriented exploits, including Data-Oriented Programming (DOP) attacks, to their assumptions/requirements and attack capabilities. We also compare known defenses against these attacks, in terms of approach, detection capabilities, overhead, and compatibility. Then, we experimentally assess the feasibility of a detection approach that is based on the Intel Processor Trace (PT) technology. PT only traces control flows, thus, is generally believed to be not useful for data-oriented security. However, our work reveals that data-oriented attacks (in particular the recent DOP attacks) may generate side-effects on control-flow behavior in multiple dimensions, which manifest in PT traces. Based on this evaluation, we discuss challenges for building deployable data-oriented defenses and open research questions.

CRApr 30, 2018

Checking is Believing: Event-Aware Program Anomaly Detection in Cyber-Physical Systems

Long Cheng, Ke Tian, Danfeng Yao et al.

Securing cyber-physical systems (CPS) against malicious attacks is of paramount importance because these attacks may cause irreparable damages to physical systems. Recent studies have revealed that control programs running on CPS devices suffer from both control-oriented attacks (e.g., code-injection or code-reuse attacks) and data-oriented attacks (e.g., non-control data attacks). Unfortunately, existing detection mechanisms are insufficient to detect runtime data-oriented exploits, due to the lack of runtime execution semantics checking. In this work, we propose Orpheus, a new security methodology for defending against data-oriented attacks by enforcing cyber-physical execution semantics. We first present a general method for reasoning cyber-physical execution semantics of a control program (i.e., causal dependencies between the physical context and program control flows), including the event identification and dependence analysis. As an instantiation of Orpheus, we then present a new program behavior model, i.e., the event-aware finite-state automaton (eFSA). eFSA takes advantage of the event-driven nature of CPS control programs and incorporates event checking in anomaly detection. It detects data-oriented exploits if a specific physical event is missing along with the corresponding event dependent state transition. We evaluate our prototype's performance by conducting case studies under data-oriented attacks. Results show that eFSA can successfully detect different runtime attacks. Our prototype on Raspberry Pi incurs a low overhead, taking 0.0001s for each state transition integrity checking, and 0.063s~0.211s for the cyber-physical contextual consistency checking.

CRJan 18, 2017

Breaking the Target: An Analysis of Target Data Breach and Lessons Learned

Xiaokui Shu, Ke Tian, Andrew Ciambrone et al.

This paper investigates and examines the events leading up to the second most devastating data breach in history: the attack on the Target Corporation. It includes a thorough step-by-step analysis of this attack and a comprehensive anatomy of the malware named BlackPOS. Also, this paper provides insight into the legal aspect of cybercrimes, along with a prosecution and sentence example of the well-known TJX case. Furthermore, we point out an urgent need for improving security mechanisms in existing systems of merchants and propose three security guidelines and defenses. Credit card security is discussed at the end of the paper with several best practices given to customers to hide their card information in purchase transactions.