RM Feb 19, 2023
Auto.gov: Learning-based Governance for Decentralized Finance (DeFi)
Jiahua Xu, Yebo Feng, Daniel Perez et al.
Decentralized finance (DeFi) is an integral component of the blockchain ecosystem, enabling a range of financial activities through smart-contract-based protocols. Traditional DeFi governance typically involves manual parameter adjustments by protocol teams or token holder votes, and is thus prone to human bias and financial risks, undermining the system's integrity and security. While existing efforts aim to establish more adaptive parameter adjustment schemes, there remains a need for a governance model that is both more efficient and resilient to significant market manipulations. In this paper, we introduce "Auto.gov", a learning-based governance framework that employs a deep Q-network (DQN) reinforcement learning (RL) strategy to perform semi-automated, data-driven parameter adjustments. We create a DeFi environment with an encoded action-state space akin to the Aave lending protocol for simulation and testing purposes, where Auto.gov has demonstrated the capability to retain funds that would have otherwise been lost to price oracle attacks. In tests with real-world data, Auto.gov outperforms the benchmark approaches by at least 14% and the static baseline model by tenfold, in terms of the preset performance metric, protocol profitability. Overall, the comprehensive evaluations confirm that Auto.gov is more efficient and effective than traditional governance methods, thereby enhancing the security, profitability, and ultimately, the sustainability of DeFi protocols.
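As a rough illustration of the learning rule a DQN agent is built on, the sketch below computes the one-step temporal-difference target r + γ·max Q(s', a') and picks a greedy action. It is not the authors' implementation: the action names, reward, and Q-values are hypothetical, and a real DQN would produce the Q-values with a neural network rather than a hard-coded list.

```python
from typing import Sequence

# Hypothetical discrete governance actions (assumption, not from the paper).
ACTIONS = ["lower_collateral_factor", "keep", "raise_collateral_factor"]


def td_target(reward: float, next_q_values: Sequence[float],
              gamma: float = 0.99, done: bool = False) -> float:
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        # No future return after a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


def greedy_action(q_values: Sequence[float]) -> int:
    """Index of the action with the highest estimated value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Illustrative numbers only.
q_next = [0.2, 0.5, 0.1]
target = td_target(reward=1.0, next_q_values=q_next, gamma=0.9)
best = ACTIONS[greedy_action(q_next)]
```

In training, the network's prediction for the taken action would be regressed toward `target`; the reward here stands in for whatever profitability signal the environment exposes.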
CR Aug 8, 2024
Eliminating Backdoors in Neural Code Models for Secure Code Understanding
Weisong Sun, Yuchen Chen, Chunrong Fang et al.
Neural code models (NCMs) have been widely used to address various code understanding tasks, such as defect detection. However, numerous recent studies reveal that such models are vulnerable to backdoor attacks. Backdoored NCMs function normally on normal/clean code snippets, but exhibit adversary-expected behavior on poisoned code snippets injected with the adversary-crafted trigger. This poses a significant security threat. Therefore, there is an urgent need for effective techniques to detect and eliminate backdoors stealthily implanted in NCMs. To address this issue, in this paper, we propose a backdoor elimination technique for secure code understanding, called EliBadCode. EliBadCode eliminates backdoors in NCMs by inverting/reverse-engineering and unlearning backdoor triggers. Specifically, EliBadCode first filters the model vocabulary for trigger tokens based on the naming conventions of specific programming languages to reduce the trigger search space and cost. Then, EliBadCode introduces a sample-specific trigger position identification method, which can reduce the interference of non-backdoor (adversarial) perturbations for subsequent trigger inversion, thereby producing effective inverted backdoor triggers efficiently. Backdoor triggers can be viewed as backdoor (adversarial) perturbations. Subsequently, EliBadCode employs a Greedy Coordinate Gradient algorithm to optimize the inverted trigger and designs a trigger anchoring method to purify the inverted trigger. Finally, EliBadCode eliminates backdoors through model unlearning. We evaluate the effectiveness of EliBadCode in eliminating backdoors implanted in multiple NCMs used for three safety-critical code understanding tasks. The results demonstrate that EliBadCode can effectively eliminate backdoors while having minimal adverse effects on the normal functionality of the model.
97.2 CR May 4
Don't Trust Your Upstream: Exploiting LLM Multi-Agent System via Topology-Guided Adversarial Propagation
Ruichao Liang, Le Yin, Jing Chen et al.
The digital world is witnessing the rapid rise of LLM-based multi-agent systems (MASs) and their powerful applications. However, their security remains insufficiently understood, as existing evaluations are largely limited to narrow attack settings and may substantially underestimate the real risks of MAS deployments. Inspired by the MAS inter-agent dependencies, where upstream outputs are reinterpreted and executed by downstream agents, we propose a topology-aware attack scheme that propagates adversarial contamination from exposed edge agents to high-privilege agents to induce malicious behaviors. By combining topology reconnaissance, contamination propagation modeling, and hierarchical payload encapsulation, our approach overcomes the key challenges of black-box attacks and makes such multi-hop compromise practical. Experiments show that our approach achieves success rates of 40% to 78% on three widely-used MAS frameworks under five topologies, and 85% on two real-world MAS applications across 20 representative scenarios. The results reveal fundamental vulnerabilities in MASs that have been overlooked by prior studies. Based on these findings, we propose a topology-trust mitigation that blocks 94.8% of such composite attacks.
98.0 CR May 6
Sealing the Audit-Runtime Gap for LLM Skills
Tingda Shen, Yebo Feng, Konglin Zhu et al.
Large language model (LLM) ecosystems such as Claude Code and ChatGPT increasingly rely on skills: packages of natural-language instructions and executable tools. Once in the LLM's context, skill content cannot be reliably separated from trusted instructions, and a skill's executable side can invoke privileged actions, exposing the skill supply chain to injection, tampering, and rug-pull attacks. Existing defenses are stage-bound: centralized signing, audit reports unbound from the runtime artifact, or policy engines that cannot attest to what was approved. We present SIGIL, the first framework that seals the audit-runtime gap for LLM skills. SIGIL delivers verifiable hosting through a tamper-evident, decentralized on-chain registry from which LLMs fetch skills directly. The registry admits four publication types, Transparent, Licensed, Sealed, and Committed, spanning plaintext public distribution, monetized access, custodial use, and off-chain workflows; before admission, every skill is vetted by a Decentralized Autonomous Organization (DAO) audit committee that supports pluggable auditing methods under a stake-and-slash economic model. At load time, SIGIL delivers verified loading through a skill verification protocol executed by a Skill Verification Loader (SVL) embedded as the mandatory loading path: the SVL retrieves and decrypts the skill as its type requires, verifies its integrity against the on-chain record, and enforces its permission manifest before context injection. We evaluate SIGIL on a real-world deployment against 1,023 in-the-wild skills spanning six attack types. At load time, the SVL verifies each skill's integrity against its on-chain record and enforces its approved permission manifest, completing batched verification under 86 ms. Together, these results show that LLM skills can be cryptographically bound from publication through runtime at practical cost.
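The load-time check described above (verify integrity against a registry record, then enforce the permission manifest before the skill enters the context) can be sketched as follows. This is only an illustration under stated assumptions: `RegistryRecord`, `verify_and_load`, the permission names, and the use of SHA-256 are all hypothetical stand-ins, and the real SVL additionally handles retrieval and decryption per publication type.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class RegistryRecord:
    """Hypothetical stand-in for the record stored in the on-chain registry."""
    content_digest: str                  # digest of the approved skill bytes
    allowed_permissions: FrozenSet[str]  # approved permission manifest


def digest(skill_bytes: bytes) -> str:
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(skill_bytes).hexdigest()


def verify_and_load(skill_bytes: bytes, requested: Iterable[str],
                    record: RegistryRecord) -> str:
    """Return the skill text only if it matches the record and its manifest."""
    if digest(skill_bytes) != record.content_digest:
        raise ValueError("skill content does not match the registry record")
    extra = set(requested) - record.allowed_permissions
    if extra:
        raise PermissionError(f"permissions not approved: {sorted(extra)}")
    return skill_bytes.decode("utf-8")


# Illustrative usage with made-up skill content.
skill = b"Summarize the files the user points to."
record = RegistryRecord(digest(skill), frozenset({"read_files"}))
loaded = verify_and_load(skill, {"read_files"}, record)
```

A skill modified after approval fails the digest comparison, and a skill requesting a permission outside the approved manifest is rejected before any of its text is injected.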
98.2 CR May 18
Babel: Jailbreaking Safety Attention via Obfuscation Distribution Optimized Sampling
Ziwei Wang, Jing Chen, Ruichao Liang et al.
Despite rigorous safety alignment, Large Language Models (LLMs) remain vulnerable to jailbreak attacks. Existing black-box methods often rely on heuristic templates or exhaustive trials, lacking mechanistic interpretability and query efficiency. In this study, we investigate an intrinsic vulnerability in the safety mechanisms of LLMs, where safety alignment relies on a small set of sparsely distributed attention heads, leaving much of the representational space weakly monitored. We formalize this phenomenon with a mathematical jailbreaking model that characterizes the delicate boundary of effective text obfuscation and analytically explains observed jailbreak behaviors. Guided by this model, we propose Babel, an efficient black-box attack framework that exploits the identified safety gap through systematic obfuscation sampling with iterative, feedback-driven distribution refinement, enabling reliable and high-success jailbreak attacks without access to model internals. Comprehensive evaluations on frontier commercial models demonstrate that Babel achieves state-of-the-art attack success rates and superior query efficiency. Specifically, compared to state-of-the-art methods, Babel increases the attack success rate on GPT-4o from 41.33% to 82.67% and on Claude-3-5-haiku from 38.33% to 78.33% within an average of 40 queries, providing a robust red-teaming methodology for LLM safety research.
AI Jan 13
Resisting Manipulative Bots in Memecoin Copy Trading: A Multi-Agent Approach with Chain-of-Thought Reasoning
Yichen Luo, Yebo Feng, Jiahua Xu et al.
The launch of $Trump coin ignited a wave in meme coin investment. Copy trading, as a strategy-agnostic approach that eliminates the need for deep trading knowledge, quickly gained widespread popularity in the meme coin market. However, copy trading is not a guarantee of profitability due to the prevalence of manipulative bots, the uncertainty of the followed wallets' future performance, and the lag in trade execution. Recently, large language models (LLMs) have shown promise in financial applications by effectively understanding multi-modal data and producing explainable decisions. However, a single LLM struggles with complex, multi-faceted tasks such as asset allocation. These challenges are even more pronounced in cryptocurrency markets, where LLMs often lack sufficient domain-specific knowledge in their training data. To address these challenges, we propose an explainable multi-agent system for meme coin copy trading. Inspired by the structure of an asset management team, our system decomposes the complex task into subtasks and coordinates specialized agents to solve them collaboratively. Employing few-shot chain-of-thought (CoT) prompting, each agent acquires professional meme coin trading knowledge, interprets multi-modal data, and generates explainable decisions. Using a dataset of 1,000 meme coin projects' transaction data, our empirical evaluation shows that the proposed multi-agent system outperforms both traditional machine learning models and single LLMs, achieving 73% and 70% precision in identifying high-quality meme coin projects and key opinion leader (KOL) wallets, respectively. The selected KOLs collectively generated a total profit of $500,000 across these projects.
3.9 DC Apr 23
Systematizing Blockchain Research Themes and Design Patterns: Insights from the University Blockchain Research Initiative (UBRI)
Chien-Chih Chen, Yitian Wang, Emma Nasseri et al.
The rapid expansion of blockchain and digital asset ecosystems has intensified the challenge of translating academic research into deployable systems and regulatory frameworks. While advances in cryptography, consensus, digital assets, and governance are substantial, institutional mechanisms that sustain research-to-deployment translation at ecosystem scale remain comparatively under-theorized. This paper examines the architectural and coordination patterns that enable such translation, using the University Blockchain Research Initiative (UBRI) network as a representative case of long-term academic and industry collaboration. Drawing on research outputs and convenings from 2022 to 2025, we synthesize recurring design tensions across technical and institutional domains, including scalability versus security, decentralization versus governance, and privacy versus compliance. Rather than cataloging individual projects, we abstract system-level themes that connect research contributions to deployment constraints and policy adaptation, providing a structured lens for understanding how academic research informs production architectures, regulatory development, and ecosystem resilience in emerging decentralized infrastructures.
92.7 CR May 4
EvoPoC: Automated Exploit Synthesis for DeFi Smart Contracts via Hierarchical Knowledge Graphs
Ruichao Liang, Jing Chen, Xianglong Li et al.
Smart contract vulnerabilities in Decentralized Finance cause billions of dollars in losses every year, yet the security community faces a critical bottleneck: identifying a vulnerability is not the same as proving it is exploitable. Manual PoC construction is prohibitively labor-intensive, leaving most disclosed vulnerabilities unverified and protocols exposed long before mitigation is applied. In this paper, we propose EvoPoC, a knowledge-driven agentic system for end-to-end contract vulnerability detection and exploit synthesis. Our core insight is that exploit synthesis is not a code generation task but a structured reasoning problem that requires grounded knowledge of protocol semantics, failure root cause, and exploit primitives. EvoPoC organizes this knowledge into a Hierarchical Knowledge Graph (HKG) that serves as structured memory for LLM-guided multi-hop reasoning. To validate exploit feasibility beyond code synthesis, EvoPoC employs a two-stage validation framework that checks exploit-path reachability via SMT solving and profit realizability via asset-level state simulation, ensuring generated PoCs satisfy both logical and economic viability constraints. Evaluated on 88 real-world DeFi attacks and 72 audited projects (2,573 contracts), EvoPoC achieves 98% recall and 0.9 F1-score in detection, and a 96.6% exploit success rate (ESR), reproducing 85 historical exploits and recovering over $116.2M revenue. EvoPoC outperforms SOTA fuzzers (Verite, ItyFuzz) by up to 5× in ESR and 300× in recoverable value, and the LLM-based exploit generator A1 by 2× and 8.5× respectively. In bug bounty evaluation, EvoPoC identified 16 confirmed 0-day vulnerabilities, helping secure over $70.6M and earning $2,900 in bounties.
TR Jan 1, 2025
LLM-Powered Multi-Agent System for Automated Crypto Portfolio Management
Yichen Luo, Yebo Feng, Jiahua Xu et al.
Cryptocurrency investment is inherently difficult due to its shorter history compared to traditional assets, the need to integrate vast amounts of data from various modalities, and the requirement for complex reasoning. While deep learning approaches have been applied to address these challenges, their black-box nature raises concerns about trust and explainability. Recently, large language models (LLMs) have shown promise in financial applications due to their ability to understand multi-modal data and generate explainable decisions. However, a single LLM faces limitations in complex, comprehensive tasks such as asset investment. These limitations are even more pronounced in cryptocurrency investment, where LLMs have less domain-specific knowledge in their training corpora. To overcome these challenges, we propose an explainable, multi-modal, multi-agent framework for cryptocurrency investment. Our framework uses specialized agents that collaborate within and across teams to handle subtasks such as data analysis, literature integration, and investment decision-making for the top 30 cryptocurrencies by market capitalization. The expert training module fine-tunes agents using multi-modal historical data and professional investment literature, while the multi-agent investment module employs real-time data to make informed cryptocurrency investment decisions. Unique intrateam and interteam collaboration mechanisms enhance prediction accuracy by adjusting final predictions based on confidence levels within agent teams and facilitating information sharing between teams. Empirical evaluation using data from November 2023 to September 2024 demonstrates that our framework outperforms single-agent models and market benchmarks in classification, asset pricing, portfolio, and explainability performance.
LG Mar 6, 2024
On the Effectiveness of Distillation in Mitigating Backdoors in Pre-trained Encoder
Tingxu Han, Shenghan Huang, Ziqi Ding et al.
In this paper, we study distillation, a defense originally used in supervised learning, as a defense against poisoned encoders in self-supervised learning (SSL). Distillation aims to distill knowledge from a given model (a.k.a. the teacher net) and transfer it to another (a.k.a. the student net). Here, we use it to distill benign knowledge from poisoned pre-trained encoders and transfer it to a new encoder, resulting in a clean pre-trained encoder. In particular, we conduct an empirical study on the effectiveness and performance of distillation against poisoned encoders. Using two state-of-the-art backdoor attacks against pre-trained image encoders and four commonly used image classification datasets, our experimental results show that distillation can reduce the attack success rate from 80.87% to 27.51% while suffering a 6.35% loss in accuracy. Moreover, we investigate the impact of three core components of distillation on performance: teacher net, student net, and distillation loss. By comparing 4 different teacher nets, 3 student nets, and 6 distillation losses, we find that fine-tuned teacher nets, warm-up-training-based student nets, and attention-based distillation loss perform best, respectively.
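To make the teacher/student idea concrete, here is a deliberately tiny sketch in which a one-parameter student is trained to reproduce a frozen teacher's outputs on clean inputs, using a squared-error distillation loss. Everything here is an assumption for illustration: the scalar "encoders", the data, and the loss are toy stand-ins, and this is not one of the six distillation losses compared in the paper.

```python
from typing import Callable, Sequence


def distill(teacher: Callable[[float], float], clean_inputs: Sequence[float],
            lr: float = 0.1, epochs: int = 200) -> float:
    """Fit a student f(x) = w * x to the teacher's outputs on clean inputs.

    Minimizes the squared error (w*x - teacher(x))**2 by plain gradient
    descent, one sample at a time. Returns the learned weight w.
    """
    w = 0.0  # student starts from scratch; the teacher stays frozen
    for _ in range(epochs):
        for x in clean_inputs:
            error = w * x - teacher(x)
            w -= lr * 2.0 * error * x  # gradient of the squared error w.r.t. w
    return w


# Toy frozen teacher (hypothetical): behaves like f(x) = 2x on clean data.
def toy_teacher(x: float) -> float:
    return 2.0 * x


student_w = distill(toy_teacher, clean_inputs=[0.5, 1.0, 1.5])
```

The student only ever sees the teacher's behavior on clean inputs, which is the intuition behind using distillation as a defense: behavior that only appears on triggered inputs has less opportunity to transfer.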
AI Apr 26, 2025
A Vision for Auto Research with LLM Agents
Chengwei Liu, Chong Wang, Jiayue Cao et al.
This paper introduces Agent-Based Auto Research, a structured multi-agent framework designed to automate, coordinate, and optimize the full lifecycle of scientific research. Leveraging the capabilities of large language models (LLMs) and modular agent collaboration, the system spans all major research phases, including literature review, ideation, methodology planning, experimentation, paper writing, peer review response, and dissemination. By addressing issues such as fragmented workflows, uneven methodological expertise, and cognitive overload, the framework offers a systematic and scalable approach to scientific inquiry. Preliminary explorations demonstrate the feasibility and potential of Auto Research as a promising paradigm for self-improving, AI-driven research processes.
SE Mar 13, 2025
Commenting Higher-level Code Unit: Full Code, Reduced Code, or Hierarchical Code Summarization
Weisong Sun, Yiran Zhang, Jie Zhu et al.
Commenting code is a crucial activity in software development, as it facilitates future maintenance and updates. To enhance the efficiency of writing comments and reduce developers' workload, researchers have proposed various automated code summarization (ACS) techniques to automatically generate comments/summaries for given code units. However, these ACS techniques primarily focus on generating summaries for code units at the method level. There is a significant lack of research on summarizing higher-level code units, such as file-level and module-level code units, despite the fact that summaries of these higher-level code units are highly useful for quickly gaining a macro-level understanding of software components and architecture. To fill this gap, in this paper, we conduct a systematic study on how to use LLMs for commenting higher-level code units, including file level and module level. These higher-level units are significantly larger than method-level ones, which poses challenges in handling long code inputs within LLM constraints and maintaining efficiency. To address these issues, we explore various summarization strategies for ACS of higher-level code units, which can be divided into three types: full code summarization, reduced code summarization, and hierarchical code summarization. The experimental results suggest that for summarizing file-level code units, using the full code is the most effective approach, with reduced code serving as a cost-efficient alternative. However, for summarizing module-level code units, hierarchical code summarization becomes the most promising strategy. In addition, inspired by the research on method-level ACS, we also investigate using the LLM as an evaluator to evaluate the quality of summaries of higher-level code units. The experimental results demonstrate that the LLM's evaluation results strongly correlate with human evaluations.
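The hierarchical strategy mentioned above can be pictured as a two-level pass: summarize each file on its own, then summarize the concatenated file summaries to obtain the module summary. The sketch below assumes a hypothetical `summarize` callable standing in for an LLM call; the function names and the toy summarizer are illustrative and not taken from the paper.

```python
from typing import Callable, Dict


def summarize_module(files: Dict[str, str],
                     summarize: Callable[[str], str]) -> str:
    """Hierarchical summarization: file summaries first, then the module.

    Each summarizer call sees either one file or the short list of file
    summaries, never the full module source at once.
    """
    file_summaries = {path: summarize(src) for path, src in sorted(files.items())}
    combined = "\n".join(f"{path}: {text}" for path, text in file_summaries.items())
    return summarize(combined)


def toy_summarize(text: str) -> str:
    """Stand-in for an LLM call: keeps the first line, truncated to 80 chars."""
    lines = text.strip().splitlines()
    return lines[0][:80] if lines else ""


module = {
    "parser.py": "# Parses config files\ndef parse(path): ...",
    "cli.py": "# Command-line entry point\ndef main(): ...",
}
module_summary = summarize_module(module, toy_summarize)
```

With a real model, `summarize` would wrap a prompt and an API call; the point of the structure is that input length per call stays bounded as the module grows.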
CY Mar 3, 2025
Perseus: Tracing the Masterminds Behind Cryptocurrency Pump-and-Dump Schemes
Honglin Fu, Yebo Feng, Cong Wu et al.
Masterminds are entities organizing, coordinating, and orchestrating cryptocurrency pump-and-dump schemes, a form of trade-based manipulation undermining market integrity and causing financial losses for unwitting investors. Previous research detects pump-and-dump activities in the market, predicts the target cryptocurrency, and examines investors and online social network (OSN) entities. However, these solutions do not address the root cause of the problem. There is a critical gap in identifying and tracing the masterminds involved in these schemes. In this research, we develop a detection system, Perseus, which collects real-time data from the OSN and cryptocurrency markets. Perseus then constructs temporal attributed graphs that preserve the direction of information diffusion and the structure of the community while leveraging graph neural networks (GNNs) to identify the masterminds behind pump-and-dump activities. Our design of Perseus leads to higher F1 scores and precision than the state-of-the-art (SOTA) fraud detection method, achieving fast training and inference speeds. Deployed in the real world from February 16 to October 9, 2024, Perseus successfully detects 438 masterminds who are efficient in the pump-and-dump information diffusion networks. Perseus provides regulators with an explanation of the risks of masterminds and oversight capabilities to mitigate the pump-and-dump schemes of cryptocurrency.
LG Nov 11, 2024
HeteroSample: Meta-path Guided Sampling for Heterogeneous Graph Representation Learning
Ao Liu, Jing Chen, Ruiying Du et al.
The rapid expansion of Internet of Things (IoT) has resulted in vast, heterogeneous graphs that capture complex interactions among devices, sensors, and systems. Efficient analysis of these graphs is critical for deriving insights in IoT scenarios such as smart cities, industrial IoT, and intelligent transportation systems. However, the scale and diversity of IoT-generated data present significant challenges, and existing methods often struggle with preserving the structural integrity and semantic richness of these complex graphs. Many current approaches fail to maintain the balance between computational efficiency and the quality of the insights generated, leading to potential loss of critical information necessary for accurate decision-making in IoT applications. We introduce HeteroSample, a novel sampling method designed to address these challenges by preserving the structural integrity, node and edge type distributions, and semantic patterns of IoT-related graphs. HeteroSample works by incorporating the novel top-leader selection, balanced neighborhood expansion, and meta-path guided sampling strategies. The key idea is to leverage the inherent heterogeneous structure and semantic relationships encoded by meta-paths to guide the sampling process. This approach ensures that the resulting subgraphs are representative of the original data while significantly reducing computational overhead. Extensive experiments demonstrate that HeteroSample outperforms state-of-the-art methods, achieving up to 15% higher F1 scores in tasks such as link prediction and node classification, while reducing runtime by 20%. These advantages make HeteroSample a transformative tool for scalable and accurate IoT applications, enabling more effective and efficient analysis of complex IoT systems, ultimately driving advancements in smart cities, industrial IoT, and beyond.
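One way to picture meta-path guided sampling is a random walk that may only step to neighbors whose node type matches the next entry of a meta-path, with the visited nodes forming the sampled subgraph. The sketch below illustrates only that generic idea; it is not HeteroSample itself (top-leader selection and balanced neighborhood expansion are not shown), and the graph, node types, and function names are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Set


def metapath_walk(adj: Dict[str, List[str]], node_type: Dict[str, str],
                  start: str, metapath: Sequence[str],
                  rng: random.Random) -> List[str]:
    """Random walk whose i-th node must have type metapath[i]."""
    if node_type[start] != metapath[0]:
        raise ValueError("start node does not match the meta-path's first type")
    walk = [start]
    for wanted in metapath[1:]:
        candidates = [n for n in adj.get(walk[-1], []) if node_type[n] == wanted]
        if not candidates:
            break  # no neighbor of the required type: stop this walk early
        walk.append(rng.choice(candidates))
    return walk


def sample_nodes(adj: Dict[str, List[str]], node_type: Dict[str, str],
                 seeds: Sequence[str], metapath: Sequence[str],
                 walks_per_seed: int, rng: random.Random) -> Set[str]:
    """Union of nodes visited by several meta-path walks from each seed."""
    sampled: Set[str] = set()
    for seed in seeds:
        for _ in range(walks_per_seed):
            sampled.update(metapath_walk(adj, node_type, seed, metapath, rng))
    return sampled


# Toy IoT-style graph (made up): devices d1, d2; sensors s1, s2; gateway g1.
node_type = {"d1": "device", "d2": "device", "s1": "sensor",
             "s2": "sensor", "g1": "gateway"}
adj = {"d1": ["s1", "s2", "g1"], "d2": ["s1", "s2"],
       "s1": ["d1", "d2"], "s2": ["d1", "d2"], "g1": ["d1"]}
rng = random.Random(0)
walk = metapath_walk(adj, node_type, "d1", ["device", "sensor", "device"], rng)
nodes = sample_nodes(adj, node_type, ["d1"], ["device", "sensor", "device"], 5, rng)
```

Because the gateway never matches the device-sensor-device pattern, it is never sampled, which is how a meta-path steers the sample toward a chosen semantic relation.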
94.4 CY Apr 1
A Visionary Look at Vibe Researching
Yebo Feng, Yang Liu
Vibe researching is an emerging paradigm in which human researchers provide high-level direction and critical judgment while LLM-based agents handle the labor-intensive execution of literature review, experimentation, data analysis, and manuscript drafting. Inspired by the "vibe coding" movement in software engineering, it occupies a middle ground between traditional manual research and fully autonomous AI research systems. This paper defines the concept, describes its methodology (multi-agent architectures, memory, tool use, retrieval-augmented generation, and the human's role as orchestrator), identifies seven technical limitations, weighs its positive and negative societal impacts, and maps each problem to a concrete future direction. Our goal is to provide the research community with a clear and honest map of the territory so that the conversation about responsible adoption can start from shared ground.
CR Jan 20, 2022
CoAvoid: Secure, Privacy-Preserved Tracing of Contacts for Infectious Diseases
Teng Li, Siwei Yin, Runze Yu et al.
To fight against infectious diseases (e.g., SARS, COVID-19, and Ebola), government agencies, technology companies and health institutes have launched various contact tracing approaches to identify and notify the people exposed to infection sources. However, existing tracing approaches can lead to severe privacy and security concerns, thereby preventing their secure and widespread use among communities. To tackle these problems, this paper proposes CoAvoid, a decentralized, privacy-preserved contact tracing system that features good dependability and usability. CoAvoid leverages the Google/Apple Exposure Notification (GAEN) API to achieve decent device compatibility and operating efficiency. It utilizes GPS along with Bluetooth Low Energy (BLE) to dependably verify user information. In addition, to enhance privacy protection, CoAvoid applies fuzzification and obfuscation measures to shelter sensitive data, making both servers and users agnostic to information of both low and high-risk populations. The evaluation demonstrates good efficacy and security of CoAvoid. Compared with four state-of-the-art contact tracing applications, CoAvoid can reduce upload data by at least 90% and simultaneously resist wormhole and replay attacks in various scenarios.