CVOct 19, 2022Code
Multi-Granularity Cross-Modality Representation Learning for Named Entity Recognition on Social MediaPeipei Liu, Gaosheng Wang, Hong Li et al.
Named Entity Recognition (NER) on social media refers to discovering and classifying entities from unstructured free-form content, and it plays an important role for various applications such as intention understanding and user recommendation. With social media posts tending to be multimodal, Multimodal Named Entity Recognition (MNER) for the text with its accompanying image is attracting more and more attention since some textual components can only be understood in combination with visual information. However, there are two drawbacks in existing approaches: 1) Meanings of the text and its accompanying image do not match always, so the text information still plays a major role. However, social media posts are usually shorter and more informal compared with other normal contents, which easily causes incomplete semantic description and the data sparsity problem. 2) Although the visual representations of whole images or objects are already used, existing methods ignore either fine-grained semantic correspondence between objects in images and words in text or the objective fact that there are misleading objects or no objects in some images. In this work, we solve the above two problems by introducing the multi-granularity cross-modality representation learning. To resolve the first problem, we enhance the representation by semantic augmentation for each word in text. As for the second issue, we perform the cross-modality semantic interaction between text and vision at the different vision granularity to get the most effective multimodal guidance representation for every word. Experiments show that our proposed approach can achieve the SOTA or approximate SOTA performance on two benchmark datasets of tweets. The code, data and the best performing models are available at https://github.com/LiuPeiP-CS/IIE4MNER
LGJul 16, 2022
Neural modal ordinary differential equations: Integrating physics-based modeling with neural ordinary differential equations for modeling high-dimensional monitored structuresZhilu Lai, Wei Liu, Xudong Jian et al.
The order/dimension of models derived on the basis of data is commonly restricted by the number of observations, or in the context of monitored systems, sensing nodes. This is particularly true for structural systems (e.g., civil or mechanical structures), which are typically high-dimensional in nature. In the scope of physics-informed machine learning, this paper proposes a framework -- termed Neural Modal ODEs -- to integrate physics-based modeling with deep learning for modeling the dynamics of monitored and high-dimensional engineered systems. Neural Ordinary Differential Equations -- Neural ODEs are exploited as the deep learning operator. In this initiating exploration, we restrict ourselves to linear or mildly nonlinear systems. We propose an architecture that couples a dynamic version of variational autoencoders with physics-informed Neural ODEs (Pi-Neural ODEs). An encoder, as a part of the autoencoder, learns the abstract mappings from the first few items of observational data to the initial values of the latent variables, which drive the learning of embedded dynamics via physics-informed Neural ODEs, imposing a modal model structure on that latent space. The decoder of the proposed model adopts the eigenmodes derived from an eigen-analysis applied to the linearized portion of a physics-based model: a process implicitly carrying the spatial relationship between degrees-of-freedom (DOFs). The framework is validated on a numerical example, and an experimental dataset of a scaled cable-stayed bridge, where the learned hybrid model is shown to outperform a purely physics-based approach to modeling. We further show the functionality of the proposed scheme within the context of virtual sensing, i.e., the recovery of generalized response quantities in unmeasured DOFs from spatially sparse data.
MMOct 28, 2022
Improving the Modality Representation with Multi-View Contrastive Learning for Multimodal Sentiment AnalysisPeipei Liu, Xin Zheng, Hong Li et al.
Modality representation learning is an important problem for multimodal sentiment analysis (MSA), since the highly distinguishable representations can contribute to improving the analysis effect. Previous works of MSA have usually focused on multimodal fusion strategies, and the deep study of modal representation learning was given less attention. Recently, contrastive learning has been confirmed effective at endowing the learned representation with stronger discriminate ability. Inspired by this, we explore the improvement approaches of modality representation with contrastive learning in this study. To this end, we devise a three-stages framework with multi-view contrastive learning to refine representations for the specific objectives. At the first stage, for the improvement of unimodal representations, we employ the supervised contrastive learning to pull samples within the same class together while the other samples are pushed apart. At the second stage, a self-supervised contrastive learning is designed for the improvement of the distilled unimodal representations after cross-modal interaction. At last, we leverage again the supervised contrastive learning to enhance the fused multimodal representation. After all the contrast trainings, we next achieve the classification task based on frozen representations. We conduct experiments on three open datasets, and results show the advance of our model.
CRMar 15, 2022
Threat Detection for General Social Engineering Attack Using Machine Learning TechniquesZuoguang Wang, Yimo Ren, Hongsong Zhu et al.
This paper explores the threat detection for general Social Engineering (SE) attack using Machine Learning (ML) techniques, rather than focusing on or limited to a specific SE attack type, e.g. email phishing. Firstly, this paper processes and obtains more SE threat data from the previous Knowledge Graph (KG), and then extracts different threat features and generates new datasets corresponding with three different feature combinations. Finally, 9 types of ML models are created and trained using the three datasets, respectively, and their performance are compared and analyzed with 27 threat detectors and 270 times of experiments. The experimental results and analyses show that: 1) the ML techniques are feasible in detecting general SE attacks and some ML models are quite effective; ML-based SE threat detection is complementary with KG-based approaches; 2) the generated datasets are usable and the SE domain ontology proposed in previous work can dissect SE attacks and deliver the SE threat features, allowing it to be used as a data model for future research. Besides, more conclusions and analyses about the characteristics of different ML detectors and the datasets are discussed.
CLOct 19, 2022
CEntRE: A paragraph-level Chinese dataset for Relation Extraction among EnterprisesPeipei Liu, Hong Li, Zhiyu Wang et al.
Enterprise relation extraction aims to detect pairs of enterprise entities and identify the business relations between them from unstructured or semi-structured text data, and it is crucial for several real-world applications such as risk analysis, rating research and supply chain security. However, previous work mainly focuses on getting attribute information about enterprises like personnel and corporate business, and pays little attention to enterprise relation extraction. To encourage further progress in the research, we introduce the CEntRE, a new dataset constructed from publicly available business news data with careful human annotation and intelligent data processing. Extensive experiments on CEntRE with six excellent models demonstrate the challenges of our proposed dataset.
CLNov 3, 2025
"Give a Positive Review Only": An Early Investigation Into In-Paper Prompt Injection Attacks and Defenses for AI ReviewersQin Zhou, Zhexin Zhang, Zhi Li et al.
With the rapid advancement of AI models, their deployment across diverse tasks has become increasingly widespread. A notable emerging application is leveraging AI models to assist in reviewing scientific papers. However, recent reports have revealed that some papers contain hidden, injected prompts designed to manipulate AI reviewers into providing overly favorable evaluations. In this work, we present an early systematic investigation into this emerging threat. We propose two classes of attacks: (1) static attack, which employs a fixed injection prompt, and (2) iterative attack, which optimizes the injection prompt against a simulated reviewer model to maximize its effectiveness. Both attacks achieve striking performance, frequently inducing full evaluation scores when targeting frontier AI reviewers. Furthermore, we show that these attacks are robust across various settings. To counter this threat, we explore a simple detection-based defense. While it substantially reduces the attack success rate, we demonstrate that an adaptive attacker can partially circumvent this defense. Our findings underscore the need for greater attention and rigorous safeguards against prompt-injection threats in AI-assisted peer review.
CRApr 26
Breaking the Secret: Economic Interventions for Combating Collusion in Embodied Multi-Agent SystemsQi Liu, Xiaohui Chen, Zhihui Zhao et al.
Collusion among autonomous agents poses a critical security threat in embodied multi-agent systems (MAS), where coordinated behaviors can deviate from global objectives and lead to real-world consequences. Existing defenses, primarily based on identity control or post-hoc behavior analysis, are insufficient to address such threats in embodied settings due to delayed feedback and noisy observations in physical environments, which make behavioral deviations difficult to detect accurately and in a timely manner. To address this challenge, we propose a mutagenic incentive intervention approach that mitigates collusion by reshaping agents' payoff structures. By rewarding agents who report collusive behavior and penalizing identified participants, the mechanism induces strategic defection and renders collusion unstable. We further design supporting mechanisms, including reporting deposits, smart contract-based reward enforcement, and encrypted communication, to ensure robustness against misuse of the incentive mechanism and retaliation from penalized agents. We implement the proposed approach in both simulated and real-world embodied environments. Experimental results show that our method effectively suppresses collusion by inducing defection, while preserving system efficiency. It achieves performance comparable to the non-collusion baseline and outperforms representative reactive defenses, thereby fulfilling the desired security objectives. These results demonstrate the effectiveness of proactive incentive design as a practical paradigm for securing embodied multi-agent systems.
CRJan 9, 2025
RAG-WM: An Efficient Black-Box Watermarking Approach for Retrieval-Augmented Generation of Large Language ModelsPeizhuo Lv, Mengjie Sun, Hao Wang et al.
In recent years, tremendous success has been witnessed in Retrieval-Augmented Generation (RAG), widely used to enhance Large Language Models (LLMs) in domain-specific, knowledge-intensive, and privacy-sensitive tasks. However, attackers may steal those valuable RAGs and deploy or commercialize them, making it essential to detect Intellectual Property (IP) infringement. Most existing ownership protection solutions, such as watermarks, are designed for relational databases and texts. They cannot be directly applied to RAGs because relational database watermarks require white-box access to detect IP infringement, which is unrealistic for the knowledge base in RAGs. Meanwhile, post-processing by the adversary's deployed LLMs typically destructs text watermark information. To address those problems, we propose a novel black-box "knowledge watermark" approach, named RAG-WM, to detect IP infringement of RAGs. RAG-WM uses a multi-LLM interaction framework, comprising a Watermark Generator, Shadow LLM & RAG, and Watermark Discriminator, to create watermark texts based on watermark entity-relationship tuples and inject them into the target RAG. We evaluate RAG-WM across three domain-specific and two privacy-sensitive tasks on four benchmark LLMs. Experimental results show that RAG-WM effectively detects the stolen RAGs in various deployed LLMs. Furthermore, RAG-WM is robust against paraphrasing, unrelated content removal, knowledge insertion, and knowledge expansion attacks. Lastly, RAG-WM can also evade watermark detection approaches, highlighting its promising application in detecting IP infringement of RAG systems.
SEMar 2, 2025
Towards Reliable LLM-Driven Fuzz Testing: Vision and Road AheadYiran Cheng, Hong Jin Kang, Lwin Khin Shar et al.
Fuzz testing is a crucial component of software security assessment, yet its effectiveness heavily relies on valid fuzz drivers and diverse seed inputs. Recent advancements in Large Language Models (LLMs) offer transformative potential for automating fuzz testing (LLM4Fuzz), particularly in generating drivers and seeds. However, current LLM4Fuzz solutions face critical reliability challenges, including low driver validity rates and seed quality trade-offs, hindering their practical adoption. This paper aims to examine the reliability bottlenecks of LLM-driven fuzzing and explores potential research directions to address these limitations. It begins with an overview of the current development of LLM4SE and emphasizes the necessity for developing reliable LLM4Fuzz solutions. Following this, the paper envisions a vision where reliable LLM4Fuzz transforms the landscape of software testing and security for industry, software development practitioners, and economic accessibility. It then outlines a road ahead for future research, identifying key challenges and offering specific suggestions for the researchers to consider. This work strives to spark innovation in the field, positioning reliable LLM4Fuzz as a fundamental component of modern software testing.
LGFeb 17, 2025
Dictionary-Learning-Based Data Pruning for System IdentificationTingna Wang, Sikai Zhang, Mingming Song et al.
System identification is normally involved in augmenting time series data by time shifting and nonlinearisation (e.g., polynomial basis), both of which introduce redundancy in features and samples. Many research works focus on reducing redundancy feature-wise, while less attention is paid to sample-wise redundancy. This paper proposes a novel data pruning method, called mini-batch FastCan, to reduce sample-wise redundancy based on dictionary learning. Time series data is represented by some representative samples, called atoms, via dictionary learning. The useful samples are selected based on their correlation with the atoms. The method is tested on one simulated dataset and two benchmark datasets. The R-squared between the coefficients of models trained on the full datasets and the coefficients of models trained on pruned datasets is adopted to evaluate the performance of data pruning methods. It is found that the proposed method significantly outperforms the random pruning method.
CRJun 4, 2024
HoneyGPT: Breaking the Trilemma in Terminal Honeypots with Large Language ModelZiyang Wang, Jianzhou You, Haining Wang et al.
Honeypots, as a strategic cyber-deception mechanism designed to emulate authentic interactions and bait unauthorized entities, often struggle with balancing flexibility, interaction depth, and deception. They typically fail to adapt to evolving attacker tactics, with limited engagement and information gathering. Fortunately, the emergent capabilities of large language models and innovative prompt-based engineering offer a transformative shift in honeypot technologies. This paper introduces HoneyGPT, a pioneering shell honeypot architecture based on ChatGPT, characterized by its cost-effectiveness and proactive engagement. In particular, we propose a structured prompt engineering framework that incorporates chain-of-thought tactics to improve long-term memory and robust security analytics, enhancing deception and engagement. Our evaluation of HoneyGPT comprises a baseline comparison based on a collected dataset and a three-month field evaluation. The baseline comparison demonstrates HoneyGPT's remarkable ability to strike a balance among flexibility, interaction depth, and deceptive capability. The field evaluation further validates HoneyGPT's superior performance in engaging attackers more deeply and capturing a wider array of novel attack vectors.
CLMay 15, 2023
Hierarchical Aligned Multimodal Learning for NER on Tweet PostsPeipei Liu, Hong Li, Yimo Ren et al.
Mining structured knowledge from tweets using named entity recognition (NER) can be beneficial for many down stream applications such as recommendation and intention understanding. With tweet posts tending to be multimodal, multimodal named entity recognition (MNER) has attracted more attention. In this paper, we propose a novel approach, which can dynamically align the image and text sequence and achieve the multi-level cross-modal learning to augment textual word representation for MNER improvement. To be specific, our framework can be split into three main stages: the first stage focuses on intra-modality representation learning to derive the implicit global and local knowledge of each modality, the second evaluates the relevance between the text and its accompanying image and integrates different grained visual information based on the relevance, the third enforces semantic refinement via iterative cross-modal interactions and co-attention. We conduct experiments on two open datasets, and the results and detailed analysis demonstrate the advantage of our model.
CRSep 24, 2021
Finding Taint-Style Vulnerabilities in Linux-based Embedded Firmware with SSE-based Alias AnalysisKai Cheng, Tao Liu, Le Guan et al.
Although the importance of using static analysis to detect taint-style vulnerabilities in Linux-based embedded firmware is widely recognized, existing approaches are plagued by three major limitations. (a) Approaches based on symbolic execution may miss alias information and therefore suffer from a high false-negative rate. (b) Approaches based on VSA (value set analysis) often provide an over-approximate pointer range. As a result, many false positives could be produced. (c) Existing work for detecting taint-style vulnerability does not consider indirect call resolution, whereas indirect calls are frequently used in Internet-facing embedded devices. As a result, many false negatives could be produced. In this work, we propose a precise demand-driven flow-, context- and field-sensitive alias analysis approach. Based on this new approach, we also design a novel indirect call resolution scheme. Combined with sanitization rule checking, our solution discovers taint-style vulnerabilities by static taint analysis. We implemented our idea with a prototype called EmTaint and evaluated it against 35 real-world embedded firmware samples from six popular vendors. EmTaint discovered at least 192 bugs, including 41 n-day bugs and 151 0-day bugs. At least 115 CVE/PSV numbers have been allocated from a subset of the reported vulnerabilities at the time of writing. Compared to state-of-the-art tools such as KARONTE and SaTC, EmTaint found significantly more bugs on the same dataset in less time.
CRAug 14, 2021
SEIGuard: An Authentication-simplified and Deceptive Scheme to Protect Server-side Social Engineering Information Against Brute-force AttacksZuoguang Wang, Jiaqian Peng, Hongsong Zhu et al.
This paper proposes an authentication-simplified and deceptive scheme (SEIGuard) to protect server-side social engineering information (SEI) and other information against brute-force attacks. In SEIGuard, the password check in authentication is omitted and this design is further combined with the SEI encryption design using honey encryption. The login password merely serves as a temporary key to encrypt SEI and there is no password plaintext or ciphertext stored in the database. During the login, the server doesn't check the login passwords, correct passwords decrypt ciphertexts to be correct plaintexts; incorrect passwords decrypt ciphertexts to be phony but plausible-looking plaintexts (sampled from the same distribution). And these two situations share the same undifferentiated backend procedures. This scheme eliminates the anchor that both online and offline brute-force attacks depending on. Furthermore, this paper presents four SEIGuard scheme designs and algorithms for 4 typical social engineering information objects (mobile phone number, identification number, email address, personal name), which represent 4 different types of message space, i.e. 1) limited and uniformly distributed, 2) limited, complex and uniformly distributed, 3) unlimited and uniformly distributed, 4) unlimited and non-uniformly distributed message space. Specially, we propose multiple small mapping files strategies, binary search algorithms, two-part HE (DTE) design and incremental mapping files solutions for the applications of SEIGuard scheme. Finally, this paper develops the SEIGuard system based on the proposed schemes, designs and algorithms. Experiment result shows that the SEIGuard scheme can effectively protect server-side SEI against brute-force attacks, and SEIGuard also has an impressive real-time response performance that is better than conventional PBE server scheme and HE encryption/decryption.
CRJul 21, 2021
Firmware Re-hosting Through Static Binary-level PortingMingfeng Xin, Hui Wen, Liting Deng et al.
The rapid growth of the Industrial Internet of Things (IIoT) has brought embedded systems into focus as major targets for both security analysts and malicious adversaries. Due to the non-standard hardware and diverse software, embedded devices present unique challenges to security analysts for the accurate analysis of firmware binaries. The diversity in hardware components and tight coupling between firmware and hardware makes it hard to perform dynamic analysis, which must have the ability to execute firmware code in virtualized environments. However, emulating the large expanse of hardware peripherals makes analysts have to frequently modify the emulator for executing various firmware code in different virtualized environments, greatly limiting the ability of security analysis. In this work, we explore the problem of firmware re-hosting related to the real-time operating system (RTOS). Specifically, developers create a Board Support Package (BSP) and develop device drivers to make that RTOS run on their platform. By providing high-level replacements for BSP routines and device drivers, we can make the minimal modification of the firmware that is to be migrated from its original hardware environment into a virtualized one. We show that an approach capable of offering the ability to execute firmware at scale through patching firmware in an automated manner without modifying the existing emulators. Our approach, called static binary-level porting, first identifies the BSP and device drivers in target firmware, then patches the firmware with pre-built BSP routines and drivers that can be adapted to the existing emulators. Finally, we demonstrate the practicality of the proposed method on multiple hardware platforms and firmware samples for security analysis. The result shows that the approach is flexible enough to emulate firmware for vulnerability assessment and exploits development.
MLJun 15, 2021
Canonical-Correlation-Based Fast Feature Selection for Structural Health MonitoringSikai Zhang, Tingna Wang, Keith Worden et al.
Feature selection refers to the process of selecting useful features for machine learning tasks, and it is also a key step for structural health monitoring (SHM). This paper proposes a fast feature selection algorithm by efficiently computing the sum of squared canonical correlation coefficients between monitored features and target variables of interest in greedy search. The proposed algorithm is applied to both synthetic and real datasets to illustrate its advantages in terms of computational speed, general classification and regression tasks, as well as damage-sensitive feature selection tasks. Furthermore, the performance of the proposed algorithm is evaluated under varying environmental conditions and on an edge computing device to investigate its applicability in real-world SHM scenarios. The results show that the proposed algorithm can successfully select useful features with extraordinarily fast computational speed, which implies that the proposed algorithm has great potential where features need to be selected and updated online frequently, or where devices have limited computing capability.
CLJun 1, 2021
Discontinuous Named Entity Recognition as Maximal Clique DiscoveryYucheng Wang, Bowen Yu, Hongsong Zhu et al.
Named entity recognition (NER) remains challenging when entity mentions can be discontinuous. Existing methods break the recognition process into several sequential steps. In training, they predict conditioned on the golden intermediate results, while at inference relying on the model output of the previous steps, which introduces exposure bias. To solve this problem, we first construct a segment graph for each sentence, in which each node denotes a segment (a continuous entity on its own, or a part of discontinuous entities), and an edge links two nodes that belong to the same entity. The nodes and edges can be generated respectively in one stage with a grid tagging scheme and learned jointly using a novel architecture named Mac. Then discontinuous NER can be reformulated as a non-parametric process of discovering maximal cliques in the graph and concatenating the spans in each clique. Experiments on three benchmarks show that our method outperforms the state-of-the-art (SOTA) results, with up to 3.5 percentage points improvement on F1, and achieves 5x speedup over the SOTA model.
CRMay 28, 2021
Social Engineering in Cybersecurity: A Domain Ontology and Knowledge Graph Application ExamplesZuoguang Wang, Hongsong Zhu, Peipei Liu et al.
Social engineering has posed a serious threat to cyberspace security. To protect against social engineering attacks, a fundamental work is to know what constitutes social engineering. This paper first develops a domain ontology of social engineering in cybersecurity and conducts ontology evaluation by its knowledge graph application. The domain ontology defines 11 concepts of core entities that significantly constitute or affect social engineering domain, together with 22 kinds of relations describing how these entities related to each other. It provides a formal and explicit knowledge schema to understand, analyze, reuse and share domain knowledge of social engineering. Furthermore, this paper builds a knowledge graph based on 15 social engineering attack incidents and scenarios. 7 knowledge graph application examples (in 6 analysis patterns) demonstrate that the ontology together with knowledge graph is useful to 1) understand and analyze social engineering attack scenario and incident, 2) find the top ranked social engineering threat elements (e.g. the most exploited human vulnerabilities and most used attack mediums), 3) find potential social engineering threats to victims, 4) find potential targets for social engineering attackers, 5) find potential attack paths from specific attacker to specific target, and 6) analyze the same origin attacks.
CLOct 26, 2020
TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair LinkingYucheng Wang, Bowen Yu, Yueyang Zhang et al.
Extracting entities and relations from unstructured text has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in identifying overlapping relations with shared entities. Prior works show that joint learning can result in a noticeable performance gain. However, they usually involve sequential interrelated steps and suffer from the problem of exposure bias. At training time, they predict with the ground truth conditions while at inference it has to make extraction from scratch. This discrepancy leads to error accumulation. To mitigate the issue, we propose in this paper a one-stage joint extraction model, namely, TPLinker, which is capable of discovering overlapping relations sharing one or both entities while immune from the exposure bias. TPLinker formulates joint extraction as a token pair linking problem and introduces a novel handshaking tagging scheme that aligns the boundary tokens of entity pairs under each relation type. Experiment results show that TPLinker performs significantly better on overlapping and multiple relation extraction, and achieves state-of-the-art performance on two public datasets.
CRDec 31, 2019
Logic Bugs in IoT Platforms and Systems: A ReviewWei Zhou, Chen Cao, Dongdong Huo et al.
In recent years, IoT platforms and systems have been rapidly emerging. Although IoT is a new technology, new does not mean simpler (than existing networked systems). Contrarily, the complexity (of IoT platforms and systems) is actually being increased in terms of the interactions between the physical world and cyberspace. The increased complexity indeed results in new vulnerabilities. This paper seeks to provide a review of the recently discovered logic bugs that are specific to IoT platforms and systems. In particular, 17 logic bugs and one weakness falling into seven categories of vulnerabilities are reviewed in this survey.