Chen Zhi

SE
4 papers
39 citations
Novelty 50%
AI Score 46

4 Papers

79.1 CR May 27
SRAF: Stealthy and Robust Adversarial Fingerprint for Copyright Verification of Large Language Models

Zhebo Wang, Zhenhua Xu, Maike Li et al.

The protection of Intellectual Property (IP) for Large Language Models (LLMs) has become a critical concern as model theft and unauthorized commercialization escalate. While adversarial fingerprinting offers a promising black-box solution for ownership verification, existing methods suffer from significant limitations: they are fragile against downstream model modifications, sensitive to system prompt variations, and easily detectable due to high-perplexity input patterns. In this paper, we propose SRAF, a stealthy and robust adversarial fingerprinting framework. SRAF employs a synergistic joint optimization strategy across homologous model variants and diverse chat templates, forcing the fingerprint to anchor onto the invariant intrinsic comprehension features of the model family. Furthermore, we introduce a Perplexity Hiding technique that embeds adversarial perturbations within Markdown tables, effectively aligning the prompt's statistics with natural language to evade perplexity-based detection. Extensive experiments on the Llama-2 model family demonstrate that SRAF significantly enhances robustness against fine-tuning, alignment, pruning, merging, and input perturbations while maintaining exceptional stealthiness and low false-positive rates, offering a practical and resilient black-box solution for LLM ownership verification.
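The perplexity-based detection mentioned in this abstract can be illustrated with a minimal sketch. It assumes per-token log-probabilities have already been obtained from some language model; the function names and the threshold are hypothetical and are not taken from the SRAF paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def looks_suspicious(token_log_probs, threshold):
    """Hypothetical filter: flag a prompt whose perplexity exceeds a chosen threshold."""
    return perplexity(token_log_probs) > threshold
```

A prompt whose tokens each have probability 0.5 under the model has perplexity 2, while one whose tokens each have probability 0.01 has perplexity 100. A fingerprint prompt that reads like natural text would be expected to fall below whatever threshold such a filter uses.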

LG Sep 28, 2023
Resisting Backdoor Attacks in Federated Learning via Bidirectional Elections and Individual Perspective

Zhen Qin, Feiyi Chen, Chen Zhi et al.

Existing approaches defend against backdoor attacks in federated learning (FL) mainly by a) mitigating the impact of infected models, or b) excluding infected models. The former negatively impacts model accuracy, while the latter usually relies on globally clear boundaries between benign and infected model updates. In practice, however, model updates tend to be mixed and scattered because local data distributions are diverse. This work focuses on excluding infected models in FL. Unlike previous approaches that take a global view, we propose Snowball, a novel anti-backdoor FL framework based on bidirectional elections from an individual perspective, inspired by one principle that we derive and two established principles from FL and deep learning. It is characterized by a) bottom-up election, where each candidate model update votes for several of its peers so that a few model updates are elected as selectees for aggregation; and b) top-down election, where the set of selectees is progressively enlarged by picking up further candidates. We compare Snowball with state-of-the-art defenses against backdoor attacks in FL on five real-world datasets, demonstrating its superior resistance to backdoor attacks and only a slight impact on the accuracy of the global model.
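The bottom-up election described in this abstract can be sketched roughly as follows. This is only an illustration: the abstract does not specify how a candidate chooses which peers to vote for, so the sketch assumes Euclidean nearest neighbors as a stand-in, and the function and parameter names are hypothetical. The top-down election is not covered.

```python
import math
from collections import Counter


def bottom_up_election(updates, votes_per_candidate=2, num_selectees=2):
    """Each update votes for its nearest peers; the most-voted indices are elected."""
    tally = Counter()
    for i, update in enumerate(updates):
        # Rank all other candidates by Euclidean distance to this one
        # (assumed voting rule, not the paper's).
        peers = sorted(
            (j for j in range(len(updates)) if j != i),
            key=lambda j: math.dist(update, updates[j]),
        )
        tally.update(peers[:votes_per_candidate])
    # Order by vote count (descending), breaking ties by index for determinism.
    ranked = sorted(range(len(updates)), key=lambda j: (-tally[j], j))
    return sorted(ranked[:num_selectees])


# Three similar updates and one outlier; the outlier receives no votes.
updates = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
selectees = bottom_up_election(updates, votes_per_candidate=2, num_selectees=3)
```

In this toy input the three clustered updates vote for one another, so the outlier at index 3 is never elected.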

SE Mar 2, 2021 Code
An Empirical Study of the Landscape of Open Source Projects in Baidu, Alibaba, and Tencent

Junxiao Han, Shuiguang Deng, David Lo et al.

Open source software has drawn increasing attention from researchers, developers, and companies. Meanwhile, many Chinese technology companies are embracing open source and choosing to open source their projects. Nevertheless, most previous studies concentrate on international companies such as Microsoft or Google, while the practical value of open source projects from Chinese technology companies remains unclear. To address this issue, we conduct a mixed-method study to investigate the landscape of projects open sourced by three large Chinese technology companies, namely Baidu, Alibaba, and Tencent (BAT). We study the categories and characteristics of open source projects, the developers' perceptions of the open sourcing effort at these companies, and the internationalization effort of their open source projects. We collected 1,000 open source projects that were open sourced by BAT on GitHub and performed an online survey that received 101 responses from developers of these projects. Some key findings include: 1) BAT prefer to open source frontend development projects, 2) 88% of the respondents are positive towards open sourcing software projects in their respective companies, 3) 64% of the respondents reveal that the most common motivations for BAT to open source their projects are the desire to gain fame, expand their influence, and gain recruitment advantage, 4) respondents believe that the most common internationalization effort is "providing an English version of readme files", 5) projects with more internationalization effort (i.e., those that include an English readme file) are more popular. Our findings provide directions for software engineering researchers and practical suggestions for software developers and Chinese technology companies.

43.5 SE Apr 2
EpiDroid: Dependency-Guided Recomposition for Deep State Discovery in Mobile GUI Testing

Jiahui Song, Jiaxin Zhi, Kangjia Zhao et al.

The increasing scale and complexity of mobile applications make automated GUI exploration essential for software quality assurance. However, existing methods often neglect state dependencies between test fragments, which leads to redundant exploration and prevents access to deep application states. We introduce EpiDroid, a black-box, pluggable framework that augments existing explorers through semantic state dependency awareness. EpiDroid distills raw traces into stable test fragments to extract underlying dependencies. It then employs a Recomposition-Replay paradigm to perform impact reasoning via LLM and deterministic replay on high-value mutable state elements. Through iterative feedback, EpiDroid refines the state-dependency graph to systematically reach deep application states. We integrated EpiDroid into both industrial and state-of-the-art research tools and evaluated it on 20 real-world apps. The results show that EpiDroid consistently improves the performance of all baselines, increasing average code coverage by 10-28% and delivering 3-4× more coverage gain compared to continuing the baselines alone from the same starting point. This demonstrates that dependency-guided recomposition unlocks deep states that forward exploration cannot access, irrespective of additional budget.