Jizhong Zhao

CR
6 papers
122 citations
Novelty 52%
AI Score 46

6 Papers

54.1 CR Mar 29 Code
SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

He Yang, Dongyi Lv, Song Ma et al.

Dataset condensation aims to synthesize compact yet informative datasets that retain the training efficacy of full-scale data, offering substantial gains in efficiency. Recent studies reveal that the condensation process can be vulnerable to backdoor attacks, where malicious triggers are injected into the condensation dataset, manipulating model behavior during inference. While prior approaches have made progress in balancing attack success rate and clean test accuracy, they often fall short in preserving stealthiness, especially in concealing the visual artifacts of condensed data or the perturbations introduced during inference. To address this challenge, we introduce Sneakdoor, which enhances stealthiness without compromising attack effectiveness. Sneakdoor exploits the inherent vulnerability of class decision boundaries and incorporates a generative module that constructs input-aware triggers aligned with local feature geometry, thereby minimizing detectability. This joint design enables the attack to remain imperceptible to both human inspection and statistical detection. Extensive experiments across multiple datasets demonstrate that Sneakdoor achieves a compelling balance among attack success rate, clean test accuracy, and stealthiness, substantially improving the invisibility of both the synthetic data and triggered samples while maintaining high attack efficacy. The code is available at https://github.com/XJTU-AI-Lab/SneakDoor.

LG Nov 27, 2023
UFDA: Universal Federated Domain Adaptation with Practical Assumptions

Xinhui Liu, Zhenghao Chen, Luping Zhou et al.

Conventional Federated Domain Adaptation (FDA) approaches usually demand an abundance of assumptions, which makes them significantly less feasible for real-world situations and introduces security hazards. This paper relaxes the assumptions of previous FDA approaches and studies a more practical scenario named Universal Federated Domain Adaptation (UFDA). It requires only the black-box model and the label set information of each source domain, while the label sets of different source domains may be inconsistent and the target-domain label set is entirely unknown. As a more effective solution for the newly proposed UFDA scenario, we propose a corresponding methodology called Hot-Learning with Contrastive Label Disambiguation (HCLD). It tackles UFDA's domain-shift and category-gap problems by using one-hot outputs from the black-box models of the various source domains. Moreover, to better distinguish the shared and unknown classes, we further present a cluster-level strategy named Mutual-Voting Decision (MVD) to extract robust consensus knowledge across peer classes from both source and target domains. Extensive experiments on three benchmark datasets demonstrate that our method achieves comparable performance in the UFDA scenario with far fewer assumptions than previous methodologies, which rely on comprehensive additional assumptions.

32.4 LG Apr 16
FedIDM: Achieving Fast and Stable Convergence in Byzantine Federated Learning through Iterative Distribution Matching

He Yang, Dongyi Lv, Wei Xi et al.

Most existing Byzantine-robust federated learning (FL) methods suffer from slow and unstable convergence. Moreover, when handling a substantial proportion of colluded malicious clients, achieving robustness typically entails compromising model utility. To address these issues, this work introduces FedIDM, which employs distribution matching to construct trustworthy condensed data for identifying and filtering abnormal clients. FedIDM consists of two main components: (1) attack-tolerant condensed data generation, and (2) robust aggregation with negative contribution-based rejection. These components exclude local updates that (1) deviate from the update direction derived from condensed data, or (2) cause a significant loss on the condensed dataset. Comprehensive evaluations on three benchmark datasets demonstrate that FedIDM achieves fast and stable convergence while maintaining acceptable model utility, under multiple state-of-the-art Byzantine attacks involving a large number of malicious clients.
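
The two rejection rules described above can be illustrated with a minimal sketch. It is not FedIDM's implementation: the function names, the cosine-similarity test for "deviates from the update direction", and the fixed thresholds `min_cosine` and `max_loss` are all assumptions made for illustration, and the reference direction and per-client losses are taken as given inputs that would be computed on the condensed data.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_updates(updates, reference, losses, min_cosine=0.0, max_loss=1.0):
    """Return indices of client updates that pass both checks.

    updates:   list of flattened client update vectors
    reference: update direction derived from the condensed data (assumed given)
    losses:    loss of each client's updated model on the condensed data
    The thresholds are hypothetical; the abstract does not specify them.
    """
    kept = []
    for i, (update, loss) in enumerate(zip(updates, losses)):
        if cosine(update, reference) < min_cosine:
            continue  # rule 1: deviates from the reference direction
        if loss > max_loss:
            continue  # rule 2: causes a significant loss on condensed data
        kept.append(i)
    return kept


def aggregate(updates, kept):
    """Average the accepted updates coordinate-wise; None if all were rejected."""
    if not kept:
        return None
    dim = len(updates[0])
    return [sum(updates[i][d] for i in kept) / len(kept) for d in range(dim)]


# Toy example: client 2 points the wrong way, client 3 has a large loss.
updates = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [1.0, 0.2]]
reference = [1.0, 0.0]
losses = [0.2, 0.3, 0.1, 5.0]
kept = filter_updates(updates, reference, losses)
global_update = aggregate(updates, kept)
```

In this toy run only the first two clients survive both checks, so the aggregate is the mean of their updates.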

CL Jan 26, 2022
On the Effectiveness of Pinyin-Character Dual-Decoding for End-to-End Mandarin Chinese ASR

Zhao Yang, Dianwen Ng, Xiao Fu et al.

End-to-end automatic speech recognition (ASR) has achieved promising results. However, most existing end-to-end ASR methods neglect language-specific characteristics. For Mandarin Chinese ASR tasks, there exists a mutually reinforcing relationship between Pinyin and Character, since Chinese characters can be romanized by Pinyin. Based on this intuition, we first investigate types of end-to-end encoder-decoder models in the single-input dual-output (SIDO) multi-task framework, and then propose a novel asynchronous decoding method with fuzzy Pinyin sampling, based on the one-to-one correspondence between Pinyin and Character. Furthermore, we propose a two-stage training strategy to make training more stable and converge faster. Results on the AISHELL-1 test sets show that the proposed enhanced dual-decoder model, without a language model, improves by a large margin over strong baseline models.

CR Aug 3, 2012
Efficient and Secure Key Extraction using CSI without Chasing down Errors

Jizhong Zhao, Wei Xi, Jinsong Han et al.

Generating keys and keeping them secret is critical in secure communications. Because of the "open-air" nature of the medium, key distribution is more susceptible to attacks in wireless communications. An ingenious solution is for the two communicating parties to generate common secret keys separately, without key exchange or distribution, and to regenerate them as needed. Extracting keys by measuring the random variation in wireless channels, e.g., RSS, has recently shown promise. In this paper, we propose an efficient Secret Key Extraction protocol without Chasing down Errors, SKECE. It establishes common cryptographic keys for two communicating parties in wireless networks via real-time measurement of Channel State Information (CSI). It outperforms RSS-based approaches for key generation in terms of multi-subcarrier measurement, perfect channel symmetry, rapid decorrelation with distance, and high sensitivity to the environment. In the SKECE design, we also propose effective mechanisms such as adaptive key stream generation, leakage-resilient consistency validation, and weighted key recombination to fully exploit the excellent properties of CSI. We implement SKECE on off-the-shelf 802.11n devices and evaluate its performance via extensive experiments. The results demonstrate that SKECE achieves more than a 3x throughput gain in key generation from one subcarrier in static scenarios and, owing to its high efficiency, a 50% reduction in communication overhead compared to state-of-the-art RSS-based approaches.
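
The general idea of deriving matching bits from reciprocal channel measurements can be sketched as follows. This is a generic median-threshold quantizer, not SKECE's adaptive key stream generation, consistency validation, or key recombination. The function names and the sample values are hypothetical, and the two measurement lists stand in for what each party would observe on one subcarrier.

```python
from statistics import median


def quantize(samples):
    """Map each sample to 1 if it lies above the sequence median, else 0."""
    threshold = median(samples)
    return [1 if s > threshold else 0 for s in samples]


def mismatch_rate(bits_a, bits_b):
    """Fraction of positions where two equal-length bit sequences differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit sequences must have the same length")
    if not bits_a:
        return 0.0
    return sum(a != b for a, b in zip(bits_a, bits_b)) / len(bits_a)


# Hypothetical measurements: Bob sees Alice's channel plus small noise,
# which is the reciprocity assumption behind channel-based key extraction.
alice_csi = [1.0, 5.0, 2.0, 8.0, 3.0, 9.0]
bob_csi = [1.1, 4.9, 2.2, 7.8, 3.1, 9.2]

alice_bits = quantize(alice_csi)
bob_bits = quantize(bob_csi)
```

Because each party thresholds against its own median, no measurement values need to be exchanged. Samples close to the threshold are where the two bit strings would start to disagree, which is the kind of error a full protocol has to handle.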

NI Aug 2, 2012
Rejecting the Attack: Source Authentication for Wi-Fi Management Frames using CSI Information

Zhiping Jiang, Jizhong Zhao, Xiang-Yang Li et al.

Compared to well-protected data frames, Wi-Fi management frames (MFs) are extremely vulnerable to various attacks. Since MFs are transmitted without encryption, attackers can forge them easily. Such attacks can be detected in a cooperative environment such as a Wireless Intrusion Detection System (WIDS). However, in a non-cooperative environment it is difficult for a single station to identify these spoofing attacks using Received Signal Strength (RSS)-based detection, because RSS correlates strongly with both the transmission power (Txpower) and the location of the sender. By exploiting some unique characteristics (i.e., rapid spatial decorrelation, independence of Txpower, and much richer dimensions) of the Channel State Information (CSI), a standard feature in the 802.11n specification, we design a prototype, called CSITE, that lets a single station authenticate Wi-Fi management frames without external support. CSITE, built upon off-the-shelf hardware, achieves precise spoofing detection without collaboration or in-advance fingerprinting. Several novel techniques are designed to address the challenges caused by user mobility and channel dynamics. To verify the performance of our solution, we implement a prototype of our design and conduct extensive evaluations in various scenarios. Our test results show that our design significantly outperforms the RSS-based method in terms of accuracy, robustness, and efficiency: CSITE improves on the RSS-based method by about 8 times in terms of falsely accepted attack frames.