AIAug 23, 2024
Trustworthy, Responsible, and Safe AI: A Comprehensive Architectural Framework for AI Safety with Challenges and MitigationsChen Chen, Xueluan Gong, Ziyao Liu et al.
AI Safety is an emerging area of critical importance to the safe adoption and deployment of AI systems. With the rapid proliferation of AI and especially with the recent advancement of Generative AI (or GAI), the technology ecosystem behind the design, development, adoption, and deployment of AI systems has drastically changed, broadening the scope of AI Safety to address impacts on public safety and national security. In this paper, we propose a novel architectural framework for understanding and analyzing AI Safety; defining its characteristics from three perspectives: Trustworthy AI, Responsible AI, and Safe AI. We provide an extensive review of current research and advancements in AI safety from these perspectives, highlighting their key challenges and mitigation approaches. Through examples from state-of-the-art technologies, particularly Large Language Models (LLMs), we present innovative mechanism, methodologies, and techniques for designing and testing AI safety. Our goal is to promote advancement in AI safety research, and ultimately enhance people's trust in digital transformation.
LGDec 15, 2025
FROC: A Unified Framework with Risk-Optimized Control for Machine Unlearning in LLMsSi Qi Goh, Yongsen Zheng, Ziyao Liu et al.
Machine unlearning (MU) seeks to eliminate the influence of specific training examples from deployed models. As large language models (LLMs) become widely used, managing risks arising from insufficient forgetting or utility loss is increasingly crucial. Current MU techniques lack effective mechanisms for evaluating and controlling these risks, hindering the selection of strategies that appropriately balance safety and utility, and raising trust concerns surrounding the "right to be forgotten." To address these issues, we propose FROC, a unified framework with Risk-Optimized Control for machine unlearning in LLMs. FROC is built around a conformal-style risk-control formulation that expresses a user-specified risk budget on unlearning behavior. This probability-based constraint enables FROC to compare MU strategies, identify feasible operating regions, and guide hyperparameter selection according to desired trade-offs between forgetting sufficiency and utility preservation. To operationalize this constraint, FROC introduces a smoothly varying continuous risk model that aggregates forgetting deficiency and utility degradation into a single configuration-level score. Building on conformal risk analysis, FROC computes (1) the Conformal Unlearning Risk (CUR), a data-driven estimated value on the probability that forgotten samples continue to influence model predictions, and (2) risk-controlled configuration sets, which identify unlearning hyperparameters that are valid under the specified risk budget. Experiments across multiple LLM MU methods demonstrate that FROC produces stable, interpretable risk landscapes and reveals consistent relationships between unlearning configurations, semantic shift, and utility impact. FROC reframes MU as a controllable, risk-aware process and offers a practical foundation for managing unlearning behavior in large-scale LLM deployments.
CRDec 29, 2025
Certifying the Right to Be Forgotten: Primal-Dual Optimization for Sample and Label Unlearning in Vertical Federated LearningYu Jiang, Xindi Tong, Ziyao Liu et al.
Federated unlearning has become an attractive approach to address privacy concerns in collaborative machine learning, for situations when sensitive data is remembered by AI models during the machine learning process. It enables the removal of specific data influences from trained models, aligning with the growing emphasis on the "right to be forgotten." While extensively studied in horizontal federated learning, unlearning in vertical federated learning (VFL) remains challenging due to the distributed feature architecture. VFL unlearning includes sample unlearning that removes specific data points' influence and label unlearning that removes entire classes. Since different parties hold complementary features of the same samples, unlearning tasks require cross-party coordination, creating computational overhead and complexities from feature interdependencies. To address such challenges, we propose FedORA (Federated Optimization for data Removal via primal-dual Algorithm), designed for sample and label unlearning in VFL. FedORA formulates the removal of certain samples or labels as a constrained optimization problem solved using a primal-dual framework. Our approach introduces a new unlearning loss function that promotes classification uncertainty rather than misclassification. An adaptive step size enhances stability, while an asymmetric batch design, considering the prior influence of the remaining data on the model, handles unlearning and retained data differently to efficiently reduce computational costs. We provide theoretical analysis proving that the model difference between FedORA and Train-from-scratch is bounded, establishing guarantees for unlearning effectiveness. Experiments on tabular and image datasets demonstrate that FedORA achieves unlearning effectiveness and utility preservation comparable to Train-from-scratch with reduced computation and communication overhead.
CLJan 3, 2025Code
An Investigation into Value Misalignment in LLM-Generated Texts for Cultural HeritageFan Bu, Zheng Wang, Siyi Wang et al.
As Large Language Models (LLMs) become increasingly prevalent in tasks related to cultural heritage, such as generating descriptions of historical monuments, translating ancient texts, preserving oral traditions, and creating educational content, their ability to produce accurate and culturally aligned texts is being increasingly relied upon by users and researchers. However, cultural value misalignments may exist in generated texts, such as the misrepresentation of historical facts, the erosion of cultural identity, and the oversimplification of complex cultural narratives, which may lead to severe consequences. Therefore, investigating value misalignment in the context of LLM for cultural heritage is crucial for mitigating these risks, yet there has been a significant lack of systematic and comprehensive study and investigation in this area. To fill this gap, we systematically assess the reliability of LLMs in generating culturally aligned texts for cultural heritage-related tasks. We conduct a comprehensive evaluation by compiling an extensive set of 1066 query tasks covering 5 widely recognized categories with 17 aspects within the knowledge framework of cultural heritage across 5 open-source LLMs, and examine both the type and rate of cultural value misalignments in the generated texts. Using both automated and manual approaches, we effectively detect and analyze the cultural value misalignments in LLM-generated texts. Our findings are concerning: over 65% of the generated texts exhibit notable cultural misalignments, with certain tasks demonstrating almost complete misalignment with key cultural values. Beyond these findings, this paper introduces a benchmark dataset and a comprehensive evaluation workflow that can serve as a valuable resource for future research aimed at enhancing the cultural sensitivity and reliability of LLMs.
IRJul 1, 2025Code
Why Multi-Interest Fairness Matters: Hypergraph Contrastive Multi-Interest Learning for Fair Conversational Recommender SystemYongsen Zheng, Zongxuan Xie, Guohua Wang et al.
Unfairness is a well-known challenge in Recommender Systems (RSs), often resulting in biased outcomes that disadvantage users or items based on attributes such as gender, race, age, or popularity. Although some approaches have started to improve fairness recommendation in offline or static contexts, the issue of unfairness often exacerbates over time, leading to significant problems like the Matthew effect, filter bubbles, and echo chambers. To address these challenges, we proposed a novel framework, Hypergraph Contrastive Multi-Interest Learning for Fair Conversational Recommender System (HyFairCRS), aiming to promote multi-interest diversity fairness in dynamic and interactive Conversational Recommender Systems (CRSs). HyFairCRS first captures a wide range of user interests by establishing diverse hypergraphs through contrastive learning. These interests are then utilized in conversations to generate informative responses and ensure fair item predictions within the dynamic user-system feedback loop. Experiments on two CRS-based datasets show that HyFairCRS achieves a new state-of-the-art performance while effectively alleviating unfairness. Our code is available at https://github.com/zysensmile/HyFairCRS.
CRMar 20, 2024
Threats, Attacks, and Defenses in Machine Unlearning: A SurveyZiyao Liu, Huanyi Ye, Chen Chen et al.
Machine Unlearning (MU) has recently gained considerable attention due to its potential to achieve Safe AI by removing the influence of specific data from trained Machine Learning (ML) models. This process, known as knowledge removal, addresses AI governance concerns of training data such as quality, sensitivity, copyright restrictions, and obsolescence. This capability is also crucial for ensuring compliance with privacy regulations such as the Right To Be Forgotten (RTBF). Furthermore, effective knowledge removal mitigates the risk of harmful outcomes, safeguarding against biases, misinformation, and unauthorized data exploitation, thereby enhancing the safe and responsible use of AI systems. Efforts have been made to design efficient unlearning approaches, with MU services being examined for integration with existing machine learning as a service (MLaaS), allowing users to submit requests to remove specific data from the training corpus. However, recent research highlights vulnerabilities in machine unlearning systems, such as information leakage and malicious unlearning, that can lead to significant security and privacy concerns. Moreover, extensive research indicates that unlearning methods and prevalent attacks fulfill diverse roles within MU systems. This underscores the intricate relationship and complex interplay among these mechanisms in maintaining system functionality and safety. This survey aims to fill the gap between the extensive number of studies on threats, attacks, and defenses in machine unlearning and the absence of a comprehensive review that categorizes their taxonomy, methods, and solutions, thus offering valuable insights for future research directions and practical implementations.
CRApr 8, 2024
Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model SecurityYihe Fan, Yuxin Cao, Ziyu Zhao et al.
Multimodal Large Language Models (MLLMs) demonstrate remarkable capabilities that increasingly influence various aspects of our daily lives, constantly defining the new boundary of Artificial General Intelligence (AGI). Image modalities, enriched with profound semantic information and a more continuous mathematical nature compared to other modalities, greatly enhance the functionalities of MLLMs when integrated. However, this integration serves as a double-edged sword, providing attackers with expansive vulnerabilities to exploit for highly covert and harmful attacks. The pursuit of reliable AI systems like powerful MLLMs has emerged as a pivotal area of contemporary research. In this paper, we endeavor to demostrate the multifaceted risks associated with the incorporation of image modalities into MLLMs. Initially, we delineate the foundational components and training processes of MLLMs. Subsequently, we construct a threat model, outlining the security vulnerabilities intrinsic to MLLMs. Moreover, we analyze and summarize existing scholarly discourses on MLLMs' attack and defense mechanisms, culminating in suggestions for the future research on MLLM security. Through this comprehensive analysis, we aim to deepen the academic understanding of MLLM security challenges and propel forward the development of trustworthy MLLM systems.
AIMar 29, 2024
A Learning-based Incentive Mechanism for Mobile AIGC Service in Decentralized Internet of VehiclesJiani Fan, Minrui Xu, Ziyao Liu et al.
Artificial Intelligence-Generated Content (AIGC) refers to the paradigm of automated content generation utilizing AI models. Mobile AIGC services in the Internet of Vehicles (IoV) network have numerous advantages over traditional cloud-based AIGC services, including enhanced network efficiency, better reconfigurability, and stronger data security and privacy. Nonetheless, AIGC service provisioning frequently demands significant resources. Consequently, resource-constrained roadside units (RSUs) face challenges in maintaining a heterogeneous pool of AIGC services and addressing all user service requests without degrading overall performance. Therefore, in this paper, we propose a decentralized incentive mechanism for mobile AIGC service allocation, employing multi-agent deep reinforcement learning to find the balance between the supply of AIGC services on RSUs and user demand for services within the IoV context, optimizing user experience and minimizing transmission latency. Experimental results demonstrate that our approach achieves superior performance compared to other baseline models.
CRNov 17, 2024
Efficient Federated Unlearning with Adaptive Differential Privacy PreservationYu Jiang, Xindi Tong, Ziyao Liu et al.
Federated unlearning (FU) offers a promising solution to effectively address the need to erase the impact of specific clients' data on the global model in federated learning (FL), thereby granting individuals the ``Right to be Forgotten". The most straightforward approach to achieve unlearning is to train the model from scratch, excluding clients who request data removal, but it is resource-intensive. Current state-of-the-art FU methods extend traditional FL frameworks by leveraging stored historical updates, enabling more efficient unlearning than training from scratch. However, the use of stored updates introduces significant privacy risks. Adversaries with access to these updates can potentially reconstruct clients' local data, a well-known vulnerability in the privacy domain. While privacy-enhanced techniques exist, their applications to FU scenarios that balance unlearning efficiency with privacy protection remain underexplored. To address this gap, we propose FedADP, a method designed to achieve both efficiency and privacy preservation in FU. Our approach incorporates an adaptive differential privacy (DP) mechanism, carefully balancing privacy and unlearning performance through a novel budget allocation strategy tailored for FU. FedADP also employs a dual-layered selection process, focusing on global models with significant changes and client updates closely aligned with the global model, reducing storage and communication costs. Additionally, a novel calibration method is introduced to facilitate effective unlearning. Extensive experimental results demonstrate that FedADP effectively manages the trade-off between unlearning efficiency and privacy protection.
SDMay 15, 2024
Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style TransferWeifei Jin, Yuxin Cao, Junjie Su et al.
In light of the widespread application of Automatic Speech Recognition (ASR) systems, their security concerns have received much more attention than ever before, primarily due to the susceptibility of Deep Neural Networks. Previous studies have illustrated that surreptitiously crafting adversarial perturbations enables the manipulation of speech recognition systems, resulting in the production of malicious commands. These attack methods mostly require adding noise perturbations under $\ell_p$ norm constraints, inevitably leaving behind artifacts of manual modifications. Recent research has alleviated this limitation by manipulating style vectors to synthesize adversarial examples based on Text-to-Speech (TTS) synthesis audio. However, style modifications based on optimization objectives significantly reduce the controllability and editability of audio styles. In this paper, we propose an attack on ASR systems based on user-customized style transfer. We first test the effect of Style Transfer Attack (STA) which combines style transfer and adversarial attack in sequential order. And then, as an improvement, we propose an iterative Style Code Attack (SCA) to maintain audio quality. Experimental results show that our method can meet the need for user-customized styles and achieve a success rate of 82% in attacks, while keeping sound naturalness due to our user study.
CVMar 31, 2024
Object-level Copy-Move Forgery Image Detection based on Inconsistency MiningJingyu Wang, Niantai Jing, Ziyao Liu et al.
In copy-move tampering operations, perpetrators often employ techniques, such as blurring, to conceal tampering traces, posing significant challenges to the detection of object-level targets with intact structures. Focus on these challenges, this paper proposes an Object-level Copy-Move Forgery Image Detection based on Inconsistency Mining (IMNet). To obtain complete object-level targets, we customize prototypes for both the source and tampered regions and dynamically update them. Additionally, we extract inconsistent regions between coarse similar regions obtained through self-correlation calculations and regions composed of prototypes. The detected inconsistent regions are used as supplements to coarse similar regions to refine pixel-level detection. We operate experiments on three public datasets which validate the effectiveness and the robustness of the proposed IMNet.
LGMar 13, 2025
Deep Learning Approaches for Anti-Money Laundering on Mobile Transactions: Review, Framework, and DirectionsJiani Fan, Lwin Khin Shar, Ruichen Zhang et al.
Money laundering is a financial crime that obscures the origin of illicit funds, necessitating the development and enforcement of anti-money laundering (AML) policies by governments and organizations. The proliferation of mobile payment platforms and smart IoT devices has significantly complicated AML investigations. As payment networks become more interconnected, there is an increasing need for efficient real-time detection to process large volumes of transaction data on heterogeneous payment systems by different operators such as digital currencies, cryptocurrencies and account-based payments. Most of these mobile payment networks are supported by connected devices, many of which are considered loT devices in the FinTech space that constantly generate data. Furthermore, the growing complexity and unpredictability of transaction patterns across these networks contribute to a higher incidence of false positives. While machine learning solutions have the potential to enhance detection efficiency, their application in AML faces unique challenges, such as addressing privacy concerns tied to sensitive financial data and managing the real-world constraint of limited data availability due to data regulations. Existing surveys in the AML literature broadly review machine learning approaches for money laundering detection, but they often lack an in-depth exploration of advanced deep learning techniques - an emerging field with significant potential. To address this gap, this paper conducts a comprehensive review of deep learning solutions and the challenges associated with their use in AML. Additionally, we propose a novel framework that applies the least-privilege principle by integrating machine learning techniques, codifying AML red flags, and employing account profiling to provide context for predictions and enable effective fraud detection under limited data availability....
CVMar 8, 2024
3D Face Reconstruction Using A Spectral-Based Graph Convolution EncoderHaoxin Xu, Zezheng Zhao, Yuxin Cao et al.
Monocular 3D face reconstruction plays a crucial role in avatar generation, with significant demand in web-related applications such as generating virtual financial advisors in FinTech. Current reconstruction methods predominantly rely on deep learning techniques and employ 2D self-supervision as a means to guide model learning. However, these methods encounter challenges in capturing the comprehensive 3D structural information of the face due to the utilization of 2D images for model training purposes. To overcome this limitation and enhance the reconstruction of 3D structural features, we propose an innovative approach that integrates existing 2D features with 3D features to guide the model learning process. Specifically, we introduce the 3D-ID Loss, which leverages the high-dimensional structure features extracted from a Spectral-Based Graph Convolution Encoder applied to the facial mesh. This approach surpasses the sole reliance on the 3D information provided by the facial mesh vertices coordinates. Our model is trained using 2D-3D data pairs from a combination of datasets and achieves state-of-the-art performance on the NoW benchmark.
CRJan 16, 2024
Towards Efficient and Certified Recovery from Poisoning Attacks in Federated LearningYu Jiang, Jiyuan Shen, Ziyao Liu et al.
Federated learning (FL) is vulnerable to poisoning attacks, where malicious clients manipulate their updates to affect the global model. Although various methods exist for detecting those clients in FL, identifying malicious clients requires sufficient model updates, and hence by the time malicious clients are detected, FL models have been already poisoned. Thus, a method is needed to recover an accurate global model after malicious clients are identified. Current recovery methods rely on (i) all historical information from participating FL clients and (ii) the initial model unaffected by the malicious clients, leading to a high demand for storage and computational resources. In this paper, we show that highly effective recovery can still be achieved based on (i) selective historical information rather than all historical information and (ii) a historical model that has not been significantly affected by malicious clients rather than the initial model. In this scenario, while maintaining comparable recovery performance, we can accelerate the recovery speed and decrease memory consumption. Following this concept, we introduce Crab, an efficient and certified recovery method, which relies on selective information storage and adaptive model rollback. Theoretically, we demonstrate that the difference between the global model recovered by Crab and the one recovered by train-from-scratch can be bounded under certain assumptions. Our empirical evaluation, conducted across three datasets over multiple machine learning models, and a variety of untargeted and targeted poisoning attacks reveals that Crab is both accurate and efficient, and consistently outperforms previous approaches in terms of both recovery speed and memory consumption.
CRFeb 10, 2022
Understanding Security in Smart City Domains From the ANT-centric PerspectiveJiani Fan, Wenzhuo Yang, Ziyao Liu et al.
A city is a large human settlement that serves the people who live there, and a smart city is a concept of how cities might better serve their residents through new forms of technology. In this paper, we focus on four major smart city domains according to Maslow's hierarchy of needs: smart utility, smart transportation, smart homes, and smart healthcare. Numerous IoT applications have been developed to achieve the intelligence that we desire in our smart domains, ranging from personal gadgets such as health trackers and smart watches to large-scale industrial IoT systems such as nuclear and energy management systems. However, many of the existing smart city IoT solutions can be made better by considering the suitability of their security strategies. Inappropriate system security designs generally occur in two scenarios: first, system designers recognize the importance of security but are unsure of where, when, or how to implement it; and second, system designers try to fit traditional security designs to meet the smart city security context. Thus, the objective of this paper is to provide application designers with the missing security link they may need to improve their security designs. By evaluating the specific context of each smart city domain and the context-specific security requirements, we aim to provide directions on when, where, and how they should implement security strategies and the possible security challenges they need to consider. In addition, we present a new perspective on security issues in smart cities from a data-centric viewpoint by referring to the reference architecture, the Activity-Network-Things (ANT)-centric architecture, built upon the concept of "security in a zero-trust environment". By doing so, we reduce the security risks posed by new system interactions or unanticipated user behaviors while avoiding the hassle of regularly upgrading security models.
CROct 17, 2020
Secure Weighted Aggregation for Federated LearningJiale Guo, Ziyao Liu, Kwok-Yan Lam et al.
The pervasive adoption of Internet-connected digital services has led to a growing concern in the personal data privacy of their customers. On the other hand, machine learning (ML) techniques have been widely adopted by digital service providers to improve operational productivity and customer satisfaction. ML inevitably accesses and processes users' personal data, which could potentially breach the relevant privacy protection regulations if not performed carefully. The situation is exacerbated by the cloud-based implementation of digital services when user data are captured and stored in distributed locations, hence aggregation of the user data for ML could be a serious breach of privacy regulations. In this backdrop, Federated Learning (FL) is an emerging area that allows ML on distributed data without the data leaving their stored location. However, depending on the nature of the digital services, data captured at different locations may carry different significance to the business operation, hence a weighted aggregation will be highly desirable for enhancing the quality of the FL-learned model. Furthermore, to prevent leakage of user data from the aggregated gradients, cryptographic mechanisms are needed to allow secure aggregation of FL. In this paper, we propose a privacy-enhanced FL scheme for supporting secure weighted aggregation. Besides, by devising a verification protocol based on Zero-Knowledge Proof (ZKP), the proposed scheme is capable of guarding against fraudulent messages from FL participants. Experimental results show that our scheme is practical and secure. Compared to existing FL approaches, our scheme achieves secure weighted aggregation with an additional security guarantee against fraudulent messages with an affordable 1.2 times runtime overheads and 1.3 times communication costs.
CRJul 24, 2020
MPC-enabled Privacy-Preserving Neural Network Training against Malicious AttackZiyao Liu, Ivan Tjuawinata, Chaoping Xing et al.
The application of secure multiparty computation (MPC) in machine learning, especially privacy-preserving neural network training, has attracted tremendous attention from the research community in recent years. MPC enables several data owners to jointly train a neural network while preserving the data privacy of each participant. However, most of the previous works focus on semi-honest threat model that cannot withstand fraudulent messages sent by malicious participants. In this paper, we propose an approach for constructing efficient $n$-party protocols for secure neural network training that can provide security for all honest participants even when a majority of the parties are malicious. Compared to the other designs that provide semi-honest security in a dishonest majority setting, our actively secure neural network training incurs affordable efficiency overheads of around 2X and 2.7X in LAN and WAN settings, respectively. Besides, we propose a scheme to allow additive shares defined over an integer ring $\mathbb{Z}_N$ to be securely converted to additive shares over a finite field $\mathbb{Z}_Q$, which may be of independent interest. Such conversion scheme is essential in securely and correctly converting shared Beaver triples defined over an integer ring generated in the preprocessing phase to triples defined over a field to be used in the calculation in the online phase.