SEMay 27
SCDBench: A Benchmark for LLM-Based Smart Contract DecompilersKaihua Qin, Dawn Song, Arthur Gervais
Smart contract decompilation aims to recover high-level source code from bytecode, but evaluating decompilers remains difficult because existing studies use narrow datasets, inconsistent metrics, and limited semantic consistency checks. This gap is increasingly important as large language models (LLMs) begin to generate source-like Solidity that may compile and appear plausible, even when its semantics diverge from the original contract. We introduce SCDBench, a dataset and benchmark methodology for LLM-based smart contract decompilation. The dataset contains 600 real-world Solidity contracts with paired bytecode inputs, ground-truth source code, and replayable semantic checkpoints. SCDBench evaluates decompiler outputs through four cumulative stages: format completeness, compilability, Application Binary Interface (ABI) recovery, and semantic consistency via differential replay. We evaluate Claude Opus 4.7, GPT-5.3-Codex, and GLM-5 in a zero-shot decompilation setting, including GLM-5 variants with and without extended reasoning and a zero-shot compilation-repair setting. The results show that frontier LLMs can often produce structured and compilable Solidity, but achieving semantic consistency remains far from solved: the best-performing frontier model perfectly decompiles only 42/600 contracts. We further show that introducing same-model compilation repair substantially improves performance at modest additional cost. SCDBench establishes a common ground for rigorous, reproducible evaluation and aims to accelerate the development of reliable smart contract decompilers for blockchain security and transparency.
CRApr 25, 2023
Blockchain Large Language ModelsYu Gai, Liyi Zhou, Kaihua Qin et al.
This paper presents a dynamic, real-time approach to detecting anomalous blockchain transactions. The proposed tool, BlockGPT, generates tracing representations of blockchain activity and trains from scratch a large language model to act as a real-time Intrusion Detection System. Unlike traditional methods, BlockGPT is designed to offer an unrestricted search space and does not rely on predefined rules or patterns, enabling it to detect a broader range of anomalies. We demonstrate the effectiveness of BlockGPT through its use as an anomaly detection tool for Ethereum transactions. In our experiments, it effectively identifies abnormal transactions among a dataset of 68M transactions and has a batched throughput of 2284 transactions per second on average. Our results show that, BlockGPT identifies abnormal transactions by ranking 49 out of 124 attacks among the top-3 most abnormal transactions interacting with their victim contracts. This work makes contributions to the field of blockchain transaction analysis by introducing a custom data encoding compatible with the transformer architecture, a domain-specific tokenization technique, and a tree encoding method specifically crafted for the Ethereum Virtual Machine (EVM) trace representation.
CRFeb 1
TxRay: Agentic Postmortem of Live Blockchain AttacksZiyue Wang, Jiangshan Yu, Kaihua Qin et al.
Decentralized Finance (DeFi) has turned blockchains into financial infrastructure, allowing anyone to trade, lend, and build protocols without intermediaries, but this openness exposes pools of value controlled by code. Within five years, the DeFi ecosystem has lost over 15.75B USD to reported exploits. Many exploits arise from permissionless opportunities that any participant can trigger using only public state and standard interfaces, which we call Anyone-Can-Take (ACT) opportunities. Despite on-chain transparency, postmortem analysis remains slow and manual: investigations start from limited evidence, sometimes only a single transaction hash, and must reconstruct the exploit lifecycle by recovering related transactions, contract code, and state dependencies. We present TxRay, a Large Language Model (LLM) agentic postmortem system that uses tool calls to reconstruct live ACT attacks from limited evidence. Starting from one or more seed transactions, TxRay recovers the exploit lifecycle, derives an evidence-backed root cause, and generates a runnable, self-contained Proof of Concept (PoC) that deterministically reproduces the incident. TxRay self-checks postmortems by encoding incident-specific semantic oracles as executable assertions. To evaluate PoC correctness and quality, we develop PoCEvaluator, an independent agentic execution-and-review evaluator. On 114 incidents from DeFiHackLabs, TxRay produces an expert-aligned root cause and an executable PoC for 105 incidents, achieving 92.11% end-to-end reproduction. Under PoCEvaluator, 98.1% of TxRay PoCs avoid hard-coding attacker addresses, a +24.8pp lift over DeFiHackLabs. In a live deployment, TxRay delivers validated root causes in 40 minutes and PoCs in 59 minutes at median latency. TxRay's oracle-validated PoCs enable attack imitation, improving coverage by 15.6% and 65.5% over STING and APE.
CRJan 22, 2022
On How Zero-Knowledge Proof Blockchain Mixers Improve, and Worsen User PrivacyZhipeng Wang, Stefanos Chaliasos, Kaihua Qin et al.
Zero-knowledge proof (ZKP) mixers are one of the most widely-used blockchain privacy solutions, operating on top of smart contract-enabled blockchains. We find that ZKP mixers are tightly intertwined with the growing number of Decentralized Finance (DeFi) attacks and Blockchain Extractable Value (BEV) extractions. Through coin flow tracing, we discover that 205 blockchain attackers and 2,595 BEV extractors leverage mixers as their source of funds, while depositing a total attack revenue of 412.87M USD. Moreover, the US OFAC sanctions against the largest ZKP mixer, Tornado.Cash, have reduced the mixer's daily deposits by more than 80%. Further, ZKP mixers advertise their level of privacy through a so-called anonymity set size, which similarly to k-anonymity allows a user to hide among a set of k other users. Through empirical measurements, we, however, find that these anonymity set claims are mostly inaccurate. For the most popular mixers on Ethereum (ETH) and Binance Smart Chain (BSC), we show how to reduce the anonymity set size on average by 27.34% and 46.02% respectively. Our empirical evidence is also the first to suggest a differing privacy-predilection of users on ETH and BSC. State-of-the-art ZKP mixers are moreover interwoven with the DeFi ecosystem by offering anonymity mining (AM) incentives, i.e., users receive monetary rewards for mixing coins. However, contrary to the claims of related work, we find that AM does not necessarily improve the quality of a mixer's anonymity set. Our findings indicate that AM attracts privacy-ignorant users, who then do not contribute to improving the privacy of other mixer users.
GNJun 15, 2021
CeFi vs. DeFi -- Comparing Centralized to Decentralized FinanceKaihua Qin, Liyi Zhou, Yaroslav Afonin et al.
To non-experts, the traditional Centralized Finance (CeFi) ecosystem may seem obscure, because users are typically not aware of the underlying rules or agreements of financial assets and products. Decentralized Finance (DeFi), however, is making its debut as an ecosystem claiming to offer transparency and control, which are partially attributable to the underlying integrity-protected blockchain, as well as currently higher financial asset yields than CeFi. Yet, the boundaries between CeFi and DeFi may not be always so clear cut. In this work, we systematically analyze the differences between CeFi and DeFi, covering legal, economic, security, privacy and market manipulation. We provide a structured methodology to differentiate between a CeFi and a DeFi service. Our findings show that certain DeFi assets (such as USDC or USDT stablecoins) do not necessarily classify as DeFi assets, and may endanger the economic security of intertwined DeFi protocols. We conclude this work with the exploration of possible synergies between CeFi and DeFi.
CRJun 14, 2021
A2MM: Mitigating Frontrunning, Transaction Reordering and Consensus Instability in Decentralized ExchangesLiyi Zhou, Kaihua Qin, Arthur Gervais
The asset trading volume on blockchain-based exchanges (DEX) increased substantially since the advent of Automated Market Makers (AMM). Yet, AMMs and their forks compete on the same blockchain, incurring unnecessary network and block-space overhead, by attracting sandwich attackers and arbitrage competitions. Moreover, conceptually speaking, a blockchain is one database, and we find little reason to partition this database into multiple competing exchanges, which then necessarily require price synchronization through arbitrage. This paper shows that DEX arbitrage and trade routing among similar AMMs can be performed efficiently and atomically on-chain within smart contracts. These insights lead us to create a new AMM design, an Automated Arbitrage Market Maker, short A2MM DEX. A2MM aims to unite multiple AMMs to reduce overheads, costs and increase blockchain security. With respect to Miner Extractable Value (MEV), A2MM serves as a decentralized design for users to atomically collect MEV, mitigating the dangers of centralized MEV relay services. We show that A2MM offers essential security benefits. First, A2MM strengthens the blockchain consensus security by mitigating the competitive exploitation of MEV, therefore reducing the risks of consensus forks. A2MM reduces the network layer overhead of competitive transactions, improves network propagation, leading to less stale blocks and better blockchain security. Through trade routing, A2MM reduces the predatory risks of sandwich attacks by taking advantage of the minimum profitable victim input. A2MM also offers financial benefits to traders. Failed swap transactions from competitive trading occupy valuable block space, implying an upward pressure on transaction fees. Our evaluations shows that A2MM frees up 32.8% block-space of AMM-related transactions. In expectation, A2MM's revenue allows to reduce swap fees by 90%.
GNJun 11, 2021
An Empirical Study of DeFi Liquidations: Incentives, Risks, and InstabilitiesKaihua Qin, Liyi Zhou, Pablo Gamito et al.
Financial speculators often seek to increase their potential gains with leverage. Debt is a popular form of leverage, and with over 39.88B USD of total value locked (TVL), the Decentralized Finance (DeFi) lending markets are thriving. Debts, however, entail the risks of liquidation, the process of selling the debt collateral at a discount to liquidators. Nevertheless, few quantitative insights are known about the existing liquidation mechanisms. In this paper, to the best of our knowledge, we are the first to study the breadth of the borrowing and lending markets of the Ethereum DeFi ecosystem. We focus on Aave, Compound, MakerDAO, and dYdX, which collectively represent over 85% of the lending market on Ethereum. Given extensive liquidation data measurements and insights, we systematize the prevalent liquidation mechanisms and are the first to provide a methodology to compare them objectively. We find that the existing liquidation designs well incentivize liquidators but sell excessive amounts of discounted collateral at the borrowers' expenses. We measure various risks that liquidation participants are exposed to and quantify the instabilities of existing lending protocols. Moreover, we propose an optimal strategy that allows liquidators to increase their liquidation profit, which may aggravate the loss of borrowers.
CRMar 3, 2021
On the Just-In-Time Discovery of Profit-Generating Transactions in DeFi ProtocolsLiyi Zhou, Kaihua Qin, Antoine Cully et al.
In this paper, we investigate two methods that allow us to automatically create profitable DeFi trades, one well-suited to arbitrage and the other applicable to more complicated settings. We first adopt the Bellman-Ford-Moore algorithm with DEFIPOSER-ARB and then create logical DeFi protocol models for a theorem prover in DEFIPOSER-SMT. While DEFIPOSER-ARB focuses on DeFi transactions that form a cycle and performs very well for arbitrage, DEFIPOSER-SMT can detect more complicated profitable transactions. We estimate that DEFIPOSER-ARB and DEFIPOSER-SMT can generate an average weekly revenue of 191.48ETH (76,592USD) and 72.44ETH (28,976USD) respectively, with the highest transaction revenue being 81.31ETH(32,524USD) and22.40ETH (8,960USD) respectively. We further show that DEFIPOSER-SMT finds the known economic bZx attack from February 2020, which yields 0.48M USD. Our forensic investigations show that this opportunity existed for 69 days and could have yielded more revenue if exploited one day earlier. Our evaluation spans 150 days, given 96 DeFi protocol actions, and 25 assets. Looking beyond the financial gains mentioned above, forks deteriorate the blockchain consensus security, as they increase the risks of double-spending and selfish mining. We explore the implications of DEFIPOSER-ARB and DEFIPOSER-SMT on blockchain consensus. Specifically, we show that the trades identified by our tools exceed the Ethereum block reward by up to 874x. Given optimal adversarial strategies provided by a Markov Decision Process (MDP), we quantify the value threshold at which a profitable transaction qualifies as Miner ExtractableValue (MEV) and would incentivize MEV-aware miners to fork the blockchain. For instance, we find that on Ethereum, a miner with a hash rate of 10% would fork the blockchain if an MEV opportunity exceeds 4x the block reward.
CRJan 14, 2021
Quantifying Blockchain Extractable Value: How dark is the forest?Kaihua Qin, Liyi Zhou, Arthur Gervais
Permissionless blockchains such as Bitcoin have excelled at financial services. Yet, opportunistic traders extract monetary value from the mesh of decentralized finance (DeFi) smart contracts through so-called blockchain extractable value (BEV). The recent emergence of centralized BEV relayer portrays BEV as a positive additional revenue source. Because BEV was quantitatively shown to deteriorate the blockchain's consensus security, BEV relayers endanger the ledger security by incentivizing rational miners to fork the chain. For example, a rational miner with a 10% hashrate will fork Ethereum if a BEV opportunity exceeds 4x the block reward. However, related work is currently missing quantitative insights on past BEV extraction to assess the practical risks of BEV objectively. In this work, we allow to quantify the BEV danger by deriving the USD extracted from sandwich attacks, liquidations, and decentralized exchange arbitrage. We estimate that over 32 months, BEV yielded 540.54M USD in profit, divided among 11,289 addresses when capturing 49,691 cryptocurrencies and 60,830 on-chain markets. The highest BEV instance we find amounts to 4.1M USD, 616.6x the Ethereum block reward. Moreover, while the practitioner's community has discussed the existence of generalized trading bots, we are, to our knowledge, the first to provide a concrete algorithm. Our algorithm can replace unconfirmed transactions without the need to understand the victim transactions' underlying logic, which we estimate to have yielded a profit of 57,037.32 ETH (35.37M USD) over 32 months of past blockchain data. Finally, we formalize and analyze emerging BEV relay systems, where miners accept BEV transactions from a centralized relay server instead of the peer-to-peer (P2P) network. We find that such relay systems aggravate the consensus layer attacks and therefore further endanger blockchain security.
CRSep 29, 2020
High-Frequency Trading on Decentralized On-Chain ExchangesLiyi Zhou, Kaihua Qin, Christof Ferreira Torres et al.
Decentralized exchanges (DEXs) allow parties to participate in financial markets while retaining full custody of their funds. However, the transparency of blockchain-based DEX in combination with the latency for transactions to be processed, makes market-manipulation feasible. For instance, adversaries could perform front-running -- the practice of exploiting (typically non-public) information that may change the price of an asset for financial gain. In this work we formalize, analytically exposit and empirically evaluate an augmented variant of front-running: sandwich attacks, which involve front- and back-running victim transactions on a blockchain-based DEX. We quantify the probability of an adversarial trader being able to undertake the attack, based on the relative positioning of a transaction within a blockchain block. We find that a single adversarial trader can earn a daily revenue of over several thousand USD when performing sandwich attacks on one particular DEX -- Uniswap, an exchange with over 5M USD daily trading volume by June 2020. In addition to a single-adversary game, we simulate the outcome of sandwich attacks under multiple competing adversaries, to account for the real-world trading environment.
CRAug 26, 2020
FileBounty: Fair Data ExchangeSimon Janin, Kaihua Qin, Akaki Mamageishvili et al.
Digital contents are typically sold online through centralized and custodian marketplaces, which requires the trading partners to trust a central entity. We present FileBounty, a fair protocol which, assuming the cryptographic hash of the file of interest is known to the buyer, is trust-free and lets a buyer purchase data for a previously agreed monetary amount, while guaranteeing the integrity of the contents. To prevent misbehavior, FileBounty guarantees that any deviation from the expected participants' behavior results in a negative financial payoff; i.e. we show that honest behavior corresponds to a subgame perfect Nash equilibrium. Our novel deposit refunding scheme is resistant to extortion attacks under rational adversaries. If buyer and seller behave honestly, FileBounty's execution requires only three on-chain transactions, while the actual data is exchanged off-chain in an efficient and privacy-preserving manner. We moreover show how FileBounty enables a flexible peer-to-peer setting where multiple parties fairly sell a file to a buyer.
CRAug 26, 2020
Applying Private Information Retrieval to Lightweight Bitcoin ClientsKaihua Qin, Henryk Hadass, Arthur Gervais et al.
Lightweight Bitcoin clients execute a Simple Payment Verification (SPV) protocol to verify the validity of transactions related to a particular user. Currently, lightweight clients use Bloom filters to significantly reduce the amount of bandwidth required to validate a particular transaction. This is despite the fact that research has shown that Bloom filters are insufficient at preserving the privacy of clients' queries. In this paper we describe our design of an SPV protocol that leverages Private Information Retrieval (PIR) to create fully private and performant queries. We show that our protocol has a low bandwidth and latency cost; properties that make our protocol a viable alternative for lightweight Bitcoin clients and other cryptocurrencies with a similar SPV model. In contract to Bloom filters, our PIR-based approach offers deterministic privacy to the user. Among our results, we show that in the worst case, clients who would like to verify 100 transactions occurring in the past week incurs a bandwidth cost of 33.54 MB with an associated latency of approximately 4.8 minutes, when using our protocol. The same query executed using the Bloom-filter-based SPV protocol incurs a bandwidth cost of 12.85 MB; this is a modest overhead considering the privacy guarantees it provides.
CRMar 8, 2020
Attacking the DeFi Ecosystem with Flash Loans for Fun and ProfitKaihua Qin, Liyi Zhou, Benjamin Livshits et al.
Credit allows a lender to loan out surplus capital to a borrower. In the traditional economy, credit bears the risk that the borrower may default on its debt, the lender hence requires upfront collateral from the borrower, plus interest fee payments. Due to the atomicity of blockchain transactions, lenders can offer flash loans, i.e., loans that are only valid within one transaction and must be repaid by the end of that transaction. This concept has lead to a number of interesting attack possibilities, some of which were exploited in February 2020. This paper is the first to explore the implication of transaction atomicity and flash loans for the nascent decentralized finance (DeFi) ecosystem. We show quantitatively how transaction atomicity increases the arbitrage revenue. We moreover analyze two existing attacks with ROIs beyond 500k%. We formulate finding the attack parameters as an optimization problem over the state of the underlying Ethereum blockchain and the state of the DeFi ecosystem. We show how malicious adversaries can efficiently maximize an attack profit and hence damage the DeFi ecosystem further. Specifically, we present how two previously executed attacks can be "boosted" to result in a profit of 829.5k USD and 1.1M USD, respectively, which is a boost of 2.37x and 1.73x, respectively.