Kejiang Ye

CR
h-index19
17papers
445citations
Novelty46%
AI Score57

17 Papers

LGMar 27, 2023Code
Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

Tianrui Qin, Xitong Gao, Juanjuan Zhao et al.

Unlearnable example attacks are data poisoning techniques that can be used to safeguard public data against unauthorized use for training deep learning models. These methods add stealthy perturbations to the original image, thereby making it difficult for deep learning models to learn from these training data effectively. Current research suggests that adversarial training can, to a certain degree, mitigate the impact of unlearnable example attacks, while common data augmentation methods are not effective against such poisons. Adversarial training, however, demands considerable computational resources and can result in non-trivial accuracy loss. In this paper, we introduce the UEraser method, which outperforms current defenses against different types of state-of-the-art unlearnable example attacks through a combination of effective data augmentation policies and loss-maximizing adversarial augmentations. In stark contrast to the current SOTA adversarial training methods, UEraser uses adversarial augmentations, which extends beyond the confines of $ \ell_p $ perturbation budget assumed by current unlearning attacks and defenses. It also helps to improve the model's generalization ability, thus protecting against accuracy loss. UEraser wipes out the unlearning effect with error-maximizing data augmentations, thus restoring trained model accuracies. Interestingly, UEraser-Lite, a fast variant without adversarial augmentations, is also highly effective in preserving clean accuracies. On challenging unlearnable CIFAR-10, CIFAR-100, SVHN, and ImageNet-subset datasets produced with various attacks, it achieves results that are comparable to those obtained during clean training. We also demonstrate its efficacy against possible adaptive attacks. Our code is open source and available to the deep learning community: https://github.com/lafeat/ueraser.

CVAug 7, 2023Code
APBench: A Unified Benchmark for Availability Poisoning Attacks and Defenses

Tianrui Qin, Xitong Gao, Juanjuan Zhao et al.

The efficacy of availability poisoning, a method of poisoning data by injecting imperceptible perturbations to prevent its use in model training, has been a hot subject of investigation. Previous research suggested that it was difficult to effectively counteract such poisoning attacks. However, the introduction of various defense methods has challenged this notion. Due to the rapid progress in this field, the performance of different novel methods cannot be accurately validated due to variations in experimental setups. To further evaluate the attack and defense capabilities of these poisoning methods, we have developed a benchmark -- APBench for assessing the efficacy of adversarial poisoning. APBench consists of 9 state-of-the-art availability poisoning attacks, 8 defense algorithms, and 4 conventional data augmentation techniques. We also have set up experiments with varying different poisoning ratios, and evaluated the attacks on multiple datasets and their transferability across model architectures. We further conducted a comprehensive evaluation of 2 additional attacks specifically targeting unsupervised models. Our results reveal the glaring inadequacy of existing attacks in safeguarding individual privacy. APBench is open source and available to the deep learning community: https://github.com/lafeat/apbench.

ITAug 16, 2022
Multi-Point Integrated Sensing and Communication: Fusion Model and Functionality Selection

Guoliang Li, Shuai Wang, Kejiang Ye et al.

Integrated sensing and communication (ISAC) represents a paradigm shift, where previously competing wireless transmissions are jointly designed to operate in harmony via the shared use of the hardware platform for improving the spectral and energy efficiencies. However, due to adversarial factors such as fading and interference, ISAC may suffer from high sensing uncertainties. This paper presents a multi-point ISAC (MPISAC) system that fuses the outputs from multiple ISAC devices for achieving higher sensing performance by exploiting multi-view data redundancy. Furthermore, we propose to effectively explore the performance trade-off between sensing and communication via a functionality selection module that adaptively determines the working state (i.e., sensing or communication) of an ISAC device. The crux of our approach is to derive a fusion model that predicts the fusion accuracy via hypothesis testing and optimal voting analysis. Simulation results demonstrate the superiority of MPISAC over various benchmark schemes and show that the proposed approach can effectively span the trade-off region in ISAC systems.

CRDec 20, 2022Code
Flareon: Stealthy any2any Backdoor Injection via Poisoned Augmentation

Tianrui Qin, Xianghuan He, Xitong Gao et al.

Open software supply chain attacks, once successful, can exact heavy costs in mission-critical applications. As open-source ecosystems for deep learning flourish and become increasingly universal, they present attackers previously unexplored avenues to code-inject malicious backdoors in deep neural network models. This paper proposes Flareon, a small, stealthy, seemingly harmless code modification that specifically targets the data augmentation pipeline with motion-based triggers. Flareon neither alters ground-truth labels, nor modifies the training loss objective, nor does it assume prior knowledge of the victim model architecture, training data, and training hyperparameters. Yet, it has a surprisingly large ramification on training -- models trained under Flareon learn powerful target-conditional (or "any2any") backdoors. The resulting models can exhibit high attack success rates for any target choices and better clean accuracies than backdoor attacks that not only seize greater control, but also assume more restrictive attack capabilities. We also demonstrate the effectiveness of Flareon against recent defenses. Flareon is fully open-source and available online to the deep learning community: https://github.com/lafeat/flareon.

CROct 28, 2022
Joint Semantic Transfer Network for IoT Intrusion Detection

Jiashu Wu, Yang Wang, Binhui Xie et al.

In this paper, we propose a Joint Semantic Transfer Network (JSTN) towards effective intrusion detection for large-scale scarcely labelled IoT domain. As a multi-source heterogeneous domain adaptation (MS-HDA) method, the JSTN integrates a knowledge rich network intrusion (NI) domain and another small-scale IoT intrusion (II) domain as source domains, and preserves intrinsic semantic properties to assist target II domain intrusion detection. The JSTN jointly transfers the following three semantics to learn a domain-invariant and discriminative feature representation. The scenario semantic endows source NI and II domain with characteristics from each other to ease the knowledge transfer process via a confused domain discriminator and categorical distribution knowledge preservation. It also reduces the source-target discrepancy to make the shared feature space domain-invariant. Meanwhile, the weighted implicit semantic transfer boosts discriminability via a fine-grained knowledge preservation, which transfers the source categorical distribution to the target domain. The source-target divergence guides the importance weighting during knowledge preservation to reflect the degree of knowledge learning. Additionally, the hierarchical explicit semantic alignment performs centroid-level and representative-level alignment with the help of a geometric similarity-aware pseudo-label refiner, which exploits the value of unlabelled target II domain and explicitly aligns feature representations from a global and local perspective in a concentrated manner. Comprehensive experiments on various tasks verify the superiority of the JSTN against state-of-the-art comparing methods, on average a 10.3% of accuracy boost is achieved. The statistical soundness of each constituting component and the computational efficiency are also verified.

DCApr 19
Cloud-native and Distributed Systems for Efficient and Scalable Large Language Models -- A Research Agenda

Minxian Xu, Jingfeng Wu, Shengye Song et al.

The rapid rise of Large Language Models (LLMs) has revolutionized various artificial intelligence (AI) applications, from natural language processing to code generation. However, the computational demands of these models, particularly in training and inference, present significant challenges. Traditional systems are often unable to meet these requirements, necessitating the integration of cloud-native and distributed architectures. This paper explores the role of cloud platforms and distributed systems in supporting the scalability, efficiency, and optimization of LLMs. We discuss the complexities of LLM deployment, including data management, resource optimization, and the need for microservices, autoscaling, and hybrid cloud-edge solutions. Additionally, we examine emerging research trends, such as serverless inference, quantum computing, and federated learning, and their potential to drive the next phase of LLM innovation. The paper concludes with a roadmap for future developments, emphasizing the need for continued research, standardization, and cross-sector collaboration to sustain the growth of LLMs in both research and enterprise applications.

CRJan 24, 2023
Heterogeneous Domain Adaptation for IoT Intrusion Detection: A Geometric Graph Alignment Approach

Jiashu Wu, Hao Dai, Yang Wang et al.

Data scarcity hinders the usability of data-dependent algorithms when tackling IoT intrusion detection (IID). To address this, we utilise the data rich network intrusion detection (NID) domain to facilitate more accurate intrusion detection for IID domains. In this paper, a Geometric Graph Alignment (GGA) approach is leveraged to mask the geometric heterogeneities between domains for better intrusion knowledge transfer. Specifically, each intrusion domain is formulated as a graph where vertices and edges represent intrusion categories and category-wise interrelationships, respectively. The overall shape is preserved via a confused discriminator incapable to identify adjacency matrices between different intrusion domain graphs. A rotation avoidance mechanism and a centre point matching mechanism is used to avoid graph misalignment due to rotation and symmetry, respectively. Besides, category-wise semantic knowledge is transferred to act as vertex-level alignment. To exploit the target data, a pseudo-label election mechanism that jointly considers network prediction, geometric property and neighbourhood information is used to produce fine-grained pseudo-label assignment. Upon aligning the intrusion graphs geometrically from different granularities, the transferred intrusion knowledge can boost IID performance. Comprehensive experiments on several intrusion datasets demonstrate state-of-the-art performance of the GGA approach and validate the usefulness of GGA constituting components.

DCMay 9
FlexPipe: Adapting Dynamic LLM Serving Through Inflight Pipeline Refactoring in Fragmented Serverless Clusters

Yanying Lin, Shijie Peng, Chengzhi Lu et al.

Serving Large Language Models (LLMs) in production faces significant challenges from highly variable request patterns and severe resource fragmentation in serverless clusters. Current systems rely on static pipeline configurations that struggle to adapt to dynamic workload conditions, leading to substantial inefficiencies. We present FlexPipe, a novel system that dynamically reconfigures pipeline architectures during runtime to address these fundamental limitations. FlexPipe decomposes models into fine-grained stages and intelligently adjusts pipeline granularity based on real-time request pattern analysis, implementing three key innovations: fine-grained model partitioning with preserved computational graph constraints, inflight pipeline refactoring with consistent cache transitions, and topology-aware resource allocation that navigates GPU fragmentation. Comprehensive evaluation on an 82-GPU cluster demonstrates that FlexPipe achieves up to 8.5x better resource efficiency while maintaining 38.3% lower latency compared to state-of-the-art systems, reducing GPU reservation requirements from 75% to 30% of peak capacity.

ROApr 20
Memory Centric Power Allocation for Multi-Agent Embodied Question Answering

Chengyang Li, Shuai Wang, Kejiang Ye et al.

This paper considers multi-agent embodied question answering (MA-EQA), which aims to query robot teams on what they have seen over a long horizon. In contrast to existing edge resource management methods that emphasize sensing, communication, or computation performance metrics, MA-EQA emphasizes the memory qualities. To cope with this paradigm shift, we propose a quality of memory (QoM) model based on generative adversarial exam (GAE), which leverages forward simulation to assess memory retrieval and uses the resulting exam scores to compute QoM values. Then we propose memory centric power allocation (MCPA), which maximizes the QoM function under communication resource constraints. Through asymptotic analysis, it is found that the transmit powers are proportional to the GAE error probability, thus prioritizing towards high-QoM robots. Extensive experiments demonstrate that MCPA achieves significant improvements over extensive benchmarks in terms of diverse metrics in various scenarios.

CVJul 25, 2025Code
BridgeNet: A Unified Multimodal Framework for Bridging 2D and 3D Industrial Anomaly Detection

An Xiang, Zixuan Huang, Xitong Gao et al.

Industrial anomaly detection for 2D objects has gained significant attention and achieved progress in anomaly detection (AD) methods. However, identifying 3D depth anomalies using only 2D information is insufficient. Despite explicitly fusing depth information into RGB images or using point cloud backbone networks to extract depth features, both approaches struggle to adequately represent 3D information in multimodal scenarios due to the disparities among different modal information. Additionally, due to the scarcity of abnormal samples in industrial data, especially in multimodal scenarios, it is necessary to perform anomaly generation to simulate real-world abnormal samples. Therefore, we propose a novel unified multimodal anomaly detection framework to address these issues. Our contributions consist of 3 key aspects. (1) We extract visible depth information from 3D point cloud data simply and use 2D RGB images to represent appearance, which disentangles depth and appearance to support unified anomaly generation. (2) Benefiting from the flexible input representation, the proposed Multi-Scale Gaussian Anomaly Generator and Unified Texture Anomaly Generator can generate richer anomalies in RGB and depth. (3) All modules share parameters for both RGB and depth data, effectively bridging 2D and 3D anomaly detection. Subsequent modules can directly leverage features from both modalities without complex fusion. Experiments show our method outperforms state-of-the-art (SOTA) on MVTec-3D AD and Eyecandies datasets. Code available at: https://github.com/Xantastic/BridgeNet

SPMay 24, 2020Code
How to Build a Graph-Based Deep Learning Architecture in Traffic Domain: A Survey

Jiexia Ye, Juanjuan Zhao, Kejiang Ye et al.

In recent years, various deep learning architectures have been proposed to solve complex challenges (e.g. spatial dependency, temporal dependency) in traffic domain, which have achieved satisfactory performance. These architectures are composed of multiple deep learning techniques in order to tackle various challenges in traffic tasks. Traditionally, convolution neural networks (CNNs) are utilized to model spatial dependency by decomposing the traffic network as grids. However, many traffic networks are graph-structured in nature. In order to utilize such spatial information fully, it's more appropriate to formulate traffic networks as graphs mathematically. Recently, various novel deep learning techniques have been developed to process graph data, called graph neural networks (GNNs). More and more works combine GNNs with other deep learning techniques to construct an architecture dealing with various challenges in a complex traffic task, where GNNs are responsible for extracting spatial correlations in traffic network. These graph-based architectures have achieved state-of-the-art performance. To provide a comprehensive and clear picture of such emerging trend, this survey carefully examines various graph-based deep learning architectures in many traffic applications. We first give guidelines to formulate a traffic problem based on graph and construct graphs from various kinds of traffic datasets. Then we decompose these graph-based architectures to discuss their shared deep learning techniques, clarifying the utilization of each technique in traffic tasks. What's more, we summarize some common traffic challenges and the corresponding graph-based deep learning solutions to each challenge. Finally, we provide benchmark datasets, open source codes and future research directions in this rapidly growing field.

ROApr 18, 2025
Green Robotic Mixed Reality with Gaussian Splatting

Chenxuan Liu, He Li, Zongze Li et al.

Realizing green communication in robotic mixed reality (RoboMR) systems presents a challenge, due to the necessity of uploading high-resolution images at high frequencies through wireless channels. This paper proposes Gaussian splatting (GS) RoboMR (GSRMR), which achieves a lower energy consumption and makes a concrete step towards green RoboMR. The crux to GSRMR is to build a GS model which enables the simulator to opportunistically render a photo-realistic view from the robot's pose, thereby reducing the need for excessive image uploads. Since the GS model may involve discrepancies compared to the actual environments, a GS cross-layer optimization (GSCLO) framework is further proposed, which jointly optimizes content switching (i.e., deciding whether to upload image or not) and power allocation across different frames. The GSCLO problem is solved by an accelerated penalty optimization (APO) algorithm. Experiments demonstrate that the proposed GSRMR reduces the communication energy by over 10x compared with RoboMR. Furthermore, the proposed GSRMR with APO outperforms extensive baseline schemes, in terms of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM).

ITOct 26, 2025
Edge Collaborative Gaussian Splatting with Integrated Rendering and Communication

Yujie Wan, Chenxuan Liu, Shuai Wang et al.

Gaussian splatting (GS) struggles with degraded rendering quality on low-cost devices. To address this issue, we present edge collaborative GS (ECO-GS), where each user can switch between a local small GS model to guarantee timeliness and a remote large GS model to guarantee fidelity. However, deciding how to engage the large GS model is nontrivial, due to the interdependency between rendering requirements and resource conditions. To this end, we propose integrated rendering and communication (IRAC), which jointly optimizes collaboration status (i.e., deciding whether to engage large GS) and edge power allocation (i.e., enabling remote rendering) under communication constraints across different users by minimizing a newly-derived GS switching function. Despite the nonconvexity of the problem, we propose an efficient penalty majorization minimization (PMM) algorithm to obtain the critical point solution. Furthermore, we develop an imitation learning optimization (ILO) algorithm, which reduces the computational time by over 100x compared to PMM. Experiments demonstrate the superiority of PMM and the real-time execution capability of ILO.

DCJul 23, 2025
BucketServe: Bucket-Based Dynamic Batching for Smart and Efficient LLM Inference Serving

Wanyi Zheng, Minxian Xu, Shengye Song et al.

Large language models (LLMs) have become increasingly popular in various areas, traditional business gradually shifting from rule-based systems to LLM-based solutions. However, the inference of LLMs is resource-intensive or latency-sensitive, posing significant challenges for serving systems. Existing LLM serving systems often use static or continuous batching strategies, which can lead to inefficient GPU memory utilization and increased latency, especially under heterogeneous workloads. These methods may also struggle to adapt to dynamic workload fluctuations, resulting in suboptimal throughput and potential service level objective (SLO) violations. In this paper, we introduce BucketServe, a bucket-based dynamic batching framework designed to optimize LLM inference performance. By grouping requests into size-homogeneous buckets based on sequence length, BucketServe minimizes padding overhead and optimizes GPU memory usage through real-time batch size adjustments preventing out-of-memory (OOM) errors. It introduces adaptive bucket splitting/merging and priority-aware scheduling to mitigate resource fragmentation and ensure SLO compliance. Experiment shows that BucketServe significantly outperforms UELLM in throughput, achieving up to 3.58x improvement. It can also handle 1.93x more request load under the SLO attainment of 80% compared with DistServe and demonstrates 1.975x higher system load capacity compared to the UELLM.

CLJun 16, 2025
Mixture of Weight-shared Heterogeneous Group Attention Experts for Dynamic Token-wise KV Optimization

Guanghui Song, Dongping Liao, Yiren Zhao et al.

Transformer models face scalability challenges in causal language modeling (CLM) due to inefficient memory allocation for growing key-value (KV) caches, which strains compute and storage resources. Existing methods like Grouped Query Attention (GQA) and token-level KV optimization improve efficiency but rely on rigid resource allocation, often discarding "low-priority" tokens or statically grouping them, failing to address the dynamic spectrum of token importance. We propose mixSGA, a novel mixture-of-expert (MoE) approach that dynamically optimizes token-wise computation and memory allocation. Unlike prior approaches, mixSGA retains all tokens while adaptively routing them to specialized experts with varying KV group sizes, balancing granularity and efficiency. Our key novelties include: (1) a token-wise expert-choice routing mechanism guided by learned importance scores, enabling proportional resource allocation without token discard; (2) weight-sharing across grouped attention projections to minimize parameter overhead; and (3) an auxiliary loss to ensure one-hot routing decisions for training-inference consistency in CLMs. Extensive evaluations across Llama3, TinyLlama, OPT, and Gemma2 model families show mixSGA's superiority over static baselines. On instruction-following and continued pretraining tasks, mixSGA achieves higher ROUGE-L and lower perplexity under the same KV budgets.

LGJul 4, 2021
Incorporating Reachability Knowledge into a Multi-Spatial Graph Convolution Based Seq2Seq Model for Traffic Forecasting

Jiexia Ye, Furong Zheng, Juanjuan Zhao et al.

Accurate traffic state prediction is the foundation of transportation control and guidance. It is very challenging due to the complex spatiotemporal dependencies in traffic data. Existing works cannot perform well for multi-step traffic prediction that involves long future time period. The spatiotemporal information dilution becomes serve when the time gap between input step and predicted step is large, especially when traffic data is not sufficient or noisy. To address this issue, we propose a multi-spatial graph convolution based Seq2Seq model. Our main novelties are three aspects: (1) We enrich the spatiotemporal information of model inputs by fusing multi-view features (time, location and traffic states) (2) We build multiple kinds of spatial correlations based on both prior knowledge and data-driven knowledge to improve model performance especially in insufficient or noisy data cases. (3) A spatiotemporal attention mechanism based on reachability knowledge is novelly designed to produce high-level features fed into decoder of Seq2Seq directly to ease information dilution. Our model is evaluated on two real world traffic datasets and achieves better performance than other competitors.

STMay 11, 2020
Multi-Graph Convolutional Network for Relationship-Driven Stock Movement Prediction

Jiexia Ye, Juanjuan Zhao, Kejiang Ye et al.

Stock price movement prediction is commonly accepted as a very challenging task due to the volatile nature of financial markets. Previous works typically predict the stock price mainly based on its own information, neglecting the cross effect among involved stocks. However, it is well known that an individual stock price is correlated with prices of other stocks in complex ways. To take the cross effect into consideration, we propose a deep learning framework, called Multi-GCGRU, which comprises graph convolutional network (GCN) and gated recurrent unit (GRU) to predict stock movement. Specifically, we first encode multiple relationships among stocks into graphs based on financial domain knowledge and utilize GCN to extract the cross effect based on these pre-defined graphs. To further get rid of prior knowledge, we explore an adaptive relationship learned by data automatically. The cross-correlation features produced by GCN are concatenated with historical records and then fed into GRU to model the temporal dependency of stock prices. Experiments on two stock indexes in China market show that our model outperforms other baselines. Note that our model is rather feasible to incorporate more effective stock relationships containing expert knowledge, as well as learn data-driven relationship.