Weiheng Bai

4papers

1citation

Novelty53%

AI Score49

Ranked #48,577 of 201,326 authors (top 24%)#968 in CR (top 13%)

4 Papers

32.4CRApr 20

Capturing Monetarily Exploitable Vulnerability in Smart Contracts via Auditor Knowledge-Learning Fuzzing

Bowen Cai, Weiheng Bai, Hangyun Tang et al.

Smart contracts extended blockchain functionality beyond simple transactions, powering complex applications like decentralized finance (DeFi). However, this complexity introduces serious security challenges, including price manipulation and inflation attacks. Despite the development of various security tools, the rapid rise in financially motivated exploits continues to pose a significant threat to the blockchain ecosystem. These financially motivated exploits often stem from Monetarily Exploitable Vulnerabilities (MEVuls), which refer to vulnerabilities arising from exploitable implementations in monetary transactions or value-transfer logic. Due to their complexity, intricate chains of function calls, multifaceted logic, and diverse manifestations across different smart contracts, MEVuls are particularly challenging for current security tools to identify. Instead of providing actionable insights, existing tools frequently generate excessive warnings that overwhelm developers without effectively mitigating risks. To address the challenge of recognizing MEVuls, we first formalize MEVuls based on common real-world financial exploits. Then, we introduce FAUDITOR, a specialized fuzzer designed to detect MEVuls in smart contracts. The key insight is that leveraging smart contracts' finance-related interfaces directly exposes critical vulnerabilities, making detection more targeted. We further integrate auditors' reports using NLP to extract valuable insights on exploitation patterns, enabling a more informed search strategy. Additionally, FAUDITOR employs a self-learning mechanism that refines its detection strategies over time, allowing it to improve based on prior fuzzing results. In our evaluation, FAUDITOR impressively reveals 220 zero-day MEVuls. Meanwhile, compared to existing fuzzers, FAUDITOR detects vulnerabilities faster and achieves better instruction coverage.

51.7CRApr 28Code

GenDetect: Generalizing Reactive Detection for Resilience Against Imitative DeFi Attack Cascade

Bowen Cai, Weiheng Bai, Youshui Lu et al.

As blockchain ecosystems grow, financially motivated attackers increasingly exploit decentralized finance (DeFi) protocols, causing frequent and severe losses. Unlike conventional cyberattacks, DeFi exploits propagate rapidly due to the transparent and composable nature of smart contracts. We identify a critical pattern, Imitative Attack Cascade: an initial successful exploit is quickly followed by mimicking transactions that reuse attack logic with minor modifications or parameter changes. Our empirical analysis shows that over 69% of DeFi attacks exhibit strong behavioral similarity to earlier incidents, often within hours or days of the initial attack. This exposes a fundamental limitation in current reactive detection. Initial attacks are typically flagged via heuristic alerts (Tornado Cash traces, anomalous nonce usage, exploiter labels), but turning these signals into detection rules requires manual validation and handcrafted trace analysis -- a labor-intensive, slow process that leaves follow-up attacks to spread. Our goal is to ensure that once an attack has been observed, even a single instance, it can be rapidly abstracted into an actionable, generalizable detection rule. We decompose the problem into two challenges: (I) abstracting the semantics of diverse, obscure function signatures, and (II) matching transaction logic in noisy, evasive traces. We leverage two insights: (i) the open-source nature of most DeFi protocols enables high-fidelity semantic classification of function signatures; (ii) contract labels isolate essential logic by filtering irrelevant calls and classifying attack intent. Building on these, we develop GenDetect, which achieves ACC 98%, FPR 1%, FNR 3% and discovers 56 previously unrevealed attacks from the past three years. Source code and dataset: https://github.com/NobodyIsAnonymous/GenDetect_ICSE2026

72.3CRMar 10

Compatibility at a Cost: Systematic Discovery and Exploitation of MCP Clause-Compliance Vulnerabilities

Nanzi Yang, Weiheng Bai, Kangjie Lu

The Model Context Protocol (MCP) is a recently proposed interoperability standard that unifies how AI agents connect with external tools and data sources. By defining a set of common client-server message exchange clauses, MCP replaces fragmented integrations with a standardized, plug-and-play framework. However, to be compatible with diverse AI agents, the MCP specification relaxes many behavioral constraints into optional clauses, leading to misuse-prone SDK implementation. We identify it as a new attack surface that allows adversaries to achieve multiple attacks (e.g, silent prompt injection, DoS, etc.), named as \emph{compatibility-abusing attacks}. In this work, we present the first systematic framework for analyzing this new attack surface across multi-language MCP SDKs. First, we construct a universal and language-agnostic intermediate representation (IR) generator that normalizes SDKs of different languages. Next, based on the new IR, we propose auditable static analysis with LLM-guided semantic reasoning for cross-language/clause compliance analysis. Third, by formalizing the attack semantics of the MCP clauses, we build three attack modalities and develop a modality-guided pipeline to uncover exploitable non-compliance issues.

57.3LGMay 7

How to Compress KV Cache in RL Post-Training? Shadow Mask Distillation for Memory-Efficient Alignment

Rui Zhu, Weiheng Bai, Qiushi Wu et al.

Reinforcement Learning (RL) has emerged as a crucial paradigm for unlocking the advanced reasoning capabilities of Large Language Models (LLMs), encompassing frameworks like RLHF and RLAIF. Regardless of the specific optimization algorithm (e.g., PPO, GRPO, or Online DPO), online RL inherently requires an exploratory trajectory generation (rollout) phase. However, for long-context reasoning tasks, this rollout phase imposes a severe ``memory wall'' due to the exorbitant Key-Value (KV) cache footprint. While applying KV cache compression during rollouts mitigates this memory overhead, it induces a critical off-policy bias. Although modern KV compression is often nearly lossless during standard inference, even minuscule approximation errors are drastically amplified by the inherent instability of RL optimization. Specifically, the sampler generates responses under a sparse context, whereas the learner updates parameters using the full, dense context. Existing statistical solutions, such as importance reweighting, struggle to correct this magnified bias, suffering from high gradient variance and severe sample inefficiency.