Hui Wen

CR
4 papers
245 citations
Novelty 43%
AI Score 28

4 Papers

CR | May 6, 2024 | Code
When LLMs Meet Cybersecurity: A Systematic Literature Review

Jie Zhang, Haoyu Bu, Hui Wen et al.

The rapid development of large language models (LLMs) has opened new avenues across various fields, including cybersecurity, which faces an evolving threat landscape and a growing demand for innovative technologies. Despite initial explorations into the application of LLMs in cybersecurity, a comprehensive overview of this research area is lacking. This paper addresses this gap with a systematic literature review that analyzes over 300 works, encompassing 25 LLMs and more than 10 downstream scenarios. Our overview addresses three key research questions: the construction of cybersecurity-oriented LLMs, the application of LLMs to various cybersecurity tasks, and the challenges and future research directions in this area. This study aims to shed light on the extensive potential of LLMs in enhancing cybersecurity practices and to serve as a valuable resource for applying LLMs in this field. We also maintain and regularly update a list of practical guides on LLMs for cybersecurity at https://github.com/tmylla/Awesome-LLM4Cybersecurity.

CR | Jul 21, 2021
Firmware Re-hosting Through Static Binary-level Porting

Mingfeng Xin, Hui Wen, Liting Deng et al.

The rapid growth of the Industrial Internet of Things (IIoT) has brought embedded systems into focus as major targets for both security analysts and malicious adversaries. Because of their non-standard hardware and diverse software, embedded devices pose unique challenges for the accurate analysis of firmware binaries. The diversity of hardware components and the tight coupling between firmware and hardware make dynamic analysis difficult, since it requires executing firmware code in a virtualized environment. Moreover, emulating the wide range of hardware peripherals forces analysts to modify the emulator frequently in order to run different firmware in different virtualized environments, which greatly limits security analysis. In this work, we explore the problem of firmware re-hosting for real-time operating systems (RTOS). Specifically, developers create a Board Support Package (BSP) and develop device drivers to make an RTOS run on their platform. By providing high-level replacements for BSP routines and device drivers, we can migrate firmware from its original hardware environment into a virtualized one with minimal modification. We show that this approach can execute firmware at scale by patching it automatically, without modifying existing emulators. Our approach, called static binary-level porting, first identifies the BSP and device drivers in the target firmware, then patches the firmware with pre-built BSP routines and drivers that can be adapted to existing emulators. Finally, we demonstrate the practicality of the proposed method on multiple hardware platforms and firmware samples for security analysis. The results show that the approach is flexible enough to emulate firmware for vulnerability assessment and exploit development.

CV | Aug 11, 2020
Transferring Inter-Class Correlation

Hui Wen, Yue Wu, Chenming Yang et al.

The Teacher-Student (T-S) framework is widely used in classification tasks: the performance of one neural network (the student) can be improved by transferring knowledge from another, already trained neural network (the teacher). Since the knowledge that can be transferred depends on the capacities and structures of the teacher and student networks, how to define efficient knowledge remains an open question. To address this issue, we design a novel form of knowledge to transfer, the self-attention-based Inter-Class Correlation (ICC) map in the output layer, and propose our T-S framework, Inter-Class Correlation Transfer (ICCT).

CL | May 5, 2017
Joint RNN Model for Argument Component Boundary Detection

Minglan Li, Yang Gao, Hui Wen et al.

Argument Component Boundary Detection (ACBD) is an important sub-task in argumentation mining; it aims to identify the word sequences that constitute argument components, and is usually considered the first sub-task in the argumentation mining pipeline. Existing ACBD methods depend heavily on task-specific knowledge and require considerable human effort on feature engineering. To tackle these problems, we formulate ACBD as a sequence labeling problem and propose a variety of Recurrent Neural Network (RNN) based methods, which use no domain-specific or handcrafted features beyond the relative position of the sentence in the document. In particular, we propose a novel joint RNN model that predicts whether sentences are argumentative and uses the predicted results to detect argument component boundaries more precisely. We evaluate our techniques on two corpora from two different genres; results suggest that our joint RNN model achieves state-of-the-art performance on both datasets.