Where Does MEV Really Come From? Revisiting CEX-DEX Arbitrage on Ethereum
For blockchain researchers and practitioners, this work provides a more accurate theoretical framework for understanding the origins of MEV, challenging previous underestimates and offering insights into the scale of arbitrage profits.
The paper revisits the theoretical model of CEX-DEX arbitrage on Ethereum, finding that prior Black-Scholes models underestimate arbitrage profits by ignoring price jumps. The authors propose an extended discrete-time AMM model with stochastic jumps, which closely matches empirical observations and explains the scale of MEV revenue.
A central question of the Ethereum ecosystem is where Maximal Extractable Value (MEV) revenue originates and to what extent it stems from harming unsuspecting users. MEV is acceptable when it arises from arbitrage between centralised and decentralised exchanges (CEX-DEX). Yet theoretical models have significantly underestimated the scale of these arbitrages, while empirical studies have highlighted their importance, although their figures remain conservative estimates constrained by numerous debatable heuristic assumptions. Revisiting the theoretical model, we find that CEX-DEX arbitrages require trading volumes on the order of the total activity of major liquidity pools and yield profits comparable to MEV. Most prior AMM models used the Black-Scholes (BS) stochastic differential equation (SDE), i.e., geometric Brownian motion, and assumed continuous price trajectories in which asset prices move in small increments only. We argue that BS underestimates arbitrage profits by ignoring price jumps, which are precisely the points at which arbitrage opportunities tend to arise. To address this gap, we present an extended discrete-time AMM model in which the price process is the sum of a diffusive component and stochastic jumps that can have arbitrary noise distributions. Although mathematically more involved, this framework allows us to employ a general discrete-time SDE and compute the stationary probability distribution via function iteration with geometric convergence. We further prove that the resulting mispricing process is an ergodic Markov chain. We implement our model in C++, collect spot prices and AMM exchange data from the Ethereum blockchain, and fit the model parameters to the observed prices. The estimates derived from our model closely match empirical observations and provide a natural theoretical explanation for several fundamental questions in the blockchain ecosystem.
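To make the jump-diffusion idea concrete, the following is a minimal Python sketch of a discrete-time mispricing chain. It is not the authors' C++ implementation, and it rests on assumptions that the abstract does not state: the log mispricing between the CEX and the AMM receives one Gaussian step per block plus an occasional jump drawn from a user-supplied distribution, and an arbitrageur resets the mispricing to the edge of a symmetric fee band whenever it leaves that band. The function name, the parameter names, and the use of the squared overshoot as a crude proxy for arbitrage profit are all hypothetical choices made for illustration.

```python
import math
import random
from typing import Callable, Dict


def simulate_mispricing(
    n_blocks: int,
    sigma: float,
    jump_prob: float,
    jump_sampler: Callable[[random.Random], float],
    fee: float,
    seed: int = 0,
) -> Dict[str, float]:
    """Simulate a fee-band mispricing chain (illustrative assumption).

    z is the log mispricing between the CEX price and the AMM price.
    Each block adds a Gaussian step with standard deviation sigma and,
    with probability jump_prob, a jump drawn from jump_sampler.
    Whenever |z| exceeds the fee band, z is reset to the band edge.
    """
    if n_blocks <= 0:
        raise ValueError("n_blocks must be positive")
    rng = random.Random(seed)
    z = 0.0
    n_arb = 0            # number of blocks with an arbitrage trade
    sum_sq = 0.0         # sum of squared overshoots (crude profit proxy)
    max_abs_z = 0.0      # largest mispricing left after arbitrage
    for _ in range(n_blocks):
        step = rng.gauss(0.0, sigma)       # diffusive component
        if rng.random() < jump_prob:       # stochastic jump component
            step += jump_sampler(rng)
        z += step
        overshoot = abs(z) - fee
        if overshoot > 0.0:
            n_arb += 1
            sum_sq += overshoot ** 2
            z = fee if z > 0.0 else -fee   # arbitrage closes the gap
        max_abs_z = max(max_abs_z, abs(z))
    return {
        "arb_fraction": n_arb / n_blocks,
        "mean_sq_overshoot": sum_sq / n_blocks,
        "max_abs_mispricing": max_abs_z,
    }


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the code.
    sigma, jump_prob, jump_sigma, fee = 0.001, 0.01, 0.02, 0.003
    with_jumps = simulate_mispricing(
        200_000, sigma, jump_prob, lambda r: r.gauss(0.0, jump_sigma), fee
    )
    # Diffusion-only benchmark with the same per-block variance.
    sigma_eq = math.sqrt(sigma ** 2 + jump_prob * jump_sigma ** 2)
    diffusion_only = simulate_mispricing(
        200_000, sigma_eq, 0.0, lambda r: 0.0, fee
    )
    print("with jumps:    ", with_jumps)
    print("diffusion only:", diffusion_only)
```

Running the script prints summary statistics for a jump-diffusion run and for a diffusion-only run with matched per-block variance, so the two can be compared under whatever parameters are supplied. Passing the jump distribution as a callable mirrors the abstract's point that the jump noise can follow an arbitrary distribution.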