Yuedong Xu

CR
h-index3
12papers
250citations
Novelty51%
AI Score47

12 Papers

MMJul 29, 2024
AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics

Xiangxiang Dai, Zeyu Zhang, Peng Yang et al. · uw

The rapid evolution of multimedia and computer vision technologies requires adaptive visual model deployment strategies to effectively handle diverse tasks and varying environments. This work introduces AxiomVision, a novel framework that can guarantee accuracy by leveraging edge computing to dynamically select the most efficient visual models for video analytics under diverse scenarios. Utilizing a tiered edge-cloud architecture, AxiomVision enables the deployment of a broad spectrum of visual models, from lightweight to complex DNNs, that can be tailored to specific scenarios while considering camera source impacts. In addition, AxiomVision provides three core innovations: (1) a dynamic visual model selection mechanism utilizing continual online learning, (2) an efficient online method that efficiently takes into account the influence of the camera's perspective, and (3) a topology-driven grouping approach that accelerates the model selection process. With rigorous theoretical guarantees, these advancements provide a scalable and effective solution for visual tasks inherent to multimedia systems, such as object detection, classification, and counting. Empirically, AxiomVision achieves a 25.7\% improvement in accuracy.

LGFeb 28
Pragma-VL: Towards a Pragmatic Arbitration of Safety and Helpfulness in MLLMs

Ming Wen, Kun Yang, Xin Chen et al.

Multimodal Large Language Models (MLLMs) pose critical safety challenges, as they are susceptible not only to adversarial attacks such as jailbreaking but also to inadvertently generating harmful content for benign users. While internal safety alignment via Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) is a primary mitigation strategy, current methods often face a safety-utility trade-off: they either refuse benign queries out of excessive caution or overlook latent risks in cross-modal interactions. To resolve this, we introduce Pragma-VL, an end-to-end alignment algorithm that enables MLLMs to pragmatically arbitrate between safety and helpfulness. First, we enhance visual risk perception with a novel cold-start SFT stage. This is achieved by applying risk-aware clustering to the visual encoder and using an interleaved dataset of risk descriptions and high-quality data. Second, we introduce a theoretically-guaranteed reward model that leverages synergistic learning. We train it with a novel data augmentation method that assigns dynamic weights based on the queries, enabling contextual arbitration between safety and helpfulness. Extensive experiments show that Pragma-VL effectively balances safety and helpfulness, outperforming baselines by 5% to 20% on most multimodal safety benchmarks while preserving its general capabilities in areas such as mathematics and knowledge reasoning.

DCDec 24, 2025
Deadline-Aware Online Scheduling for LLM Fine-Tuning with Spot Market Predictions

Linggao Kong, Yuedong Xu, Lei Jiao et al.

As foundation models grow in size, fine-tuning them becomes increasingly expensive. While GPU spot instances offer a low-cost alternative to on-demand resources, their volatile prices and availability make deadline-aware scheduling particularly challenging. We tackle this difficulty by using a mix of spot and on-demand instances. Distinctively, we show the predictability of prices and availability in a spot instance market, the power of prediction in enabling cost-efficient scheduling and its sensitivity to estimation errors. An integer programming problem is formulated to capture the use of mixed instances under both the price and availability dynamics. We propose an online allocation algorithm with prediction based on the committed horizon control approach that leverages a \emph{commitment level} to enforce the partial sequence of decisions. When this prediction becomes inaccurate, we further present a complementary online algorithm without predictions. An online policy selection algorithm is developed that learns the best policy from a pool constructed by varying the parameters of both algorithms. We prove that the prediction-based algorithm achieves tighter performance bounds as prediction error decreases, while the policy selection algorithm possesses a regret bound of $\mathcal{O}(\sqrt{T})$. Experimental results demonstrate that our online framework can adaptively select the best policy under varying spot market dynamics and prediction quality, consistently outperforming baselines and improving utility by up to 54.8\%.

CRJul 14, 2025
Differentially Private Federated Low Rank Adaptation Beyond Fixed-Matrix

Ming Wen, Jiaqi Zhu, Yuedong Xu et al.

Large language models (LLMs) typically require fine-tuning for domain-specific tasks, and LoRA offers a computationally efficient approach by training low-rank adapters. LoRA is also communication-efficient for federated LLMs when multiple users collaboratively fine-tune a global LLM model without sharing their proprietary raw data. However, even the transmission of local adapters between a server and clients risks serious privacy leakage. Applying differential privacy (DP) to federated LoRA encounters a dilemma: adding noise to both adapters amplifies synthetic noise on the model, while fixing one adapter impairs the learnability of fine-tuning. In this paper, we propose FedASK (Differentially Private Federated Low Rank Adaptation with Double Sketching) , a novel federated LoRA framework to enable effective updating of both low-rank adapters with robust differential privacy. Inspired by randomized SVD, our key idea is a two-stage sketching pipeline. This pipeline first aggregates carefully sketched, privacy-preserving local updates, and then reconstructs the global matrices on the server to facilitate effective updating of both adapters. We theoretically prove FedASK's differential privacy guarantee and its exact aggregation property. Comprehensive experiments demonstrate that FedASK consistently outperforms baseline methods across a variety of privacy settings and data distributions.

LGOct 18, 2024
A Communication and Computation Efficient Fully First-order Method for Decentralized Bilevel Optimization

Min Wen, Chengchang Liu, Ahmed Abdelmoniem et al.

Bilevel optimization, crucial for hyperparameter tuning, meta-learning and reinforcement learning, remains less explored in the decentralized learning paradigm, such as decentralized federated learning (DFL). Typically, decentralized bilevel methods rely on both gradients and Hessian matrices to approximate hypergradients of upper-level models. However, acquiring and sharing the second-order oracle is compute and communication intensive. % and sharing this information incurs heavy communication overhead. To overcome these challenges, this paper introduces a fully first-order decentralized method for decentralized Bilevel optimization, $\text{C}^2$DFB which is both compute- and communicate-efficient. In $\text{C}^2$DFB, each learning node optimizes a min-min-max problem to approximate hypergradient by exclusively using gradients information. To reduce the traffic load at the inner-loop of solving the lower-level problem, $\text{C}^2$DFB incorporates a lightweight communication protocol for efficiently transmitting compressed residuals of local parameters. % during the inner loops. Rigorous theoretical analysis ensures its convergence % of the algorithm, indicating a first-order oracle calls of $\tilde{\mathcal{O}}(ε^{-4})$. Experiments on hyperparameter tuning and hyper-representation tasks validate the superiority of $\text{C}^2$DFB across various typologies and heterogeneous data distributions.

CRDec 20, 2021
Blockchain Mining with Multiple Selfish Miners

Qianlan Bai, Yuedong Xu, Nianyi Liu et al.

This paper studies a fundamental problem regarding the security of blockchain PoW consensus on how the existence of multiple misbehaving miners influences the profitability of selfish mining. Each selfish miner (or attacker interchangeably) maintains a private chain and makes it public opportunistically for acquiring more rewards incommensurate to his Hash power. We first establish a general Markov chain model to characterize the state transition of public and private chains for Basic Selfish Mining (BSM), and derive the stationary profitable threshold of Hash power in closed-form. It reduces from 25% for a single attacker to below 21.48% for two symmetric attackers theoretically, and further reduces to around 10% with eight symmetric attackers experimentally. We next explore the profitable threshold when one of the attackers performs strategic mining based on Partially Observable Markov Decision Process (POMDP) that only half of the attributes pertinent to a mining state are observable to him. An online algorithm is presented to compute the nearly optimal policy efficiently despite the large state space and high dimensional belief space. The strategic attacker mines selfishly and more agilely than BSM attacker when his Hash power is relatively high, and mines honestly otherwise, thus leading to a much lower profitable threshold. Last, we formulate a simple model of absolute mining revenue that yields an interesting observation: selfish mining is never profitable at the first difficulty adjustment period, but replying on the reimbursement of stationary selfish mining gains in the future periods. The delay till being profitable of an attacker increases with the decrease of his Hash power, making blockchain miners more cautious on performing selfish mining.

LGDec 20, 2021
Decentralized Stochastic Proximal Gradient Descent with Variance Reduction over Time-varying Networks

Xuanjie Li, Yuedong Xu, Jessie Hui Wang et al.

In decentralized learning, a network of nodes cooperate to minimize an overall objective function that is usually the finite-sum of their local objectives, and incorporates a non-smooth regularization term for the better generalization ability. Decentralized stochastic proximal gradient (DSPG) method is commonly used to train this type of learning models, while the convergence rate is retarded by the variance of stochastic gradients. In this paper, we propose a novel algorithm, namely DPSVRG, to accelerate the decentralized training by leveraging the variance reduction technique. The basic idea is to introduce an estimator in each node, which tracks the local full gradient periodically, to correct the stochastic gradient at each iteration. By transforming our decentralized algorithm into a centralized inexact proximal gradient algorithm with variance reduction, and controlling the bounds of error sequences, we prove that DPSVRG converges at the rate of $O(1/T)$ for general convex objectives plus a non-smooth term with $T$ as the number of iterations, while DSPG converges at the rate $O(\frac{1}{\sqrt{T}})$. Our experiments on different applications, network topologies and learning models demonstrate that DPSVRG converges much faster than DSPG, and the loss function of DPSVRG decreases smoothly along with the training epochs.

CRMay 21, 2021
SCSGuard: Deep Scam Detection for Ethereum Smart Contracts

Huiwen Hu, Yuedong Xu

Smart contract is the building block of blockchain systems that enables automated peer-to-peer transactions and decentralized services. With the increasing popularity of smart contracts, blockchain systems, in particular Ethereum, have been the "paradise" of versatile fraud activities in which Ponzi, Honeypot and Phishing are the prominent ones. Formal verification and symbolic analysis have been employed to combat these destructive scams by analyzing the codes and function calls, yet the vulnerability of each \emph{individual} scam should be predefined discreetly. In this work, we present SCSGuard, a novel deep learning scam detection framework that harnesses the automatically extractable bytecodes of smart contracts as their new features. We design a GRU network with attention mechanism to learn from the \emph{N-gram bytecode} patterns, and determines whether a smart contract is fraudulent or not. Our framework is advantageous over the baseline algorithms in three aspects. Firstly, SCSGuard provides a unified solution to different scam genres, thus relieving the need of code analysis skills. Secondly, the inference of SCSGuard is faster than the code analysis by several order of magnitudes. Thirdly, experimental results manifest that SCSGuard achieves high accuracy (0.92$\sim$0.94), precision (0.94$\sim$0.96\%) and recall (0.97$\sim$0.98) for both Ponzi and Honeypot scams under similar settings, and is potentially useful to detect new Phishing smart contracts.

SIJan 15, 2020
Evolution of Ethereum: A Temporal Graph Perspective

Qianlan Bai, Chao Zhang, Yuedong Xu et al.

Ethereum is one of the most popular blockchain systems that supports more than half a million transactions every day and fosters miscellaneous decentralized applications with its Turing-complete smart contract machine. Whereas it remains mysterious what the transaction pattern of Ethereum is and how it evolves over time. In this paper, we study the evolutionary behavior of Ethereum transactions from a temporal graph point of view. We first develop a data analytics platform to collect external transactions associated with users as well as internal transactions initiated by smart contracts. Three types of temporal graphs, user-to-user, contract-to-contract and user-contract graphs, are constructed according to trading relationship and are segmented with an appropriate time window. We observe a strong correlation between the size of user-to-user transaction graph and the average Ether price in a time window, while no evidence of such linkage is shown at the average degree, average edge weights and average triplet closure duration. The macroscopic and microscopic burstiness of Ethereum transactions is validated. We analyze the Gini indexes of the transaction graphs and the user wealth in which Ethereum is found to be very unfair since the very beginning, in a sense, "the rich is already very rich".

NIMay 9, 2019
Toward Packet Routing with Fully-distributed Multi-agent Deep Reinforcement Learning

Xinyu You, Xuanjie Li, Yuedong Xu et al.

Packet routing is one of the fundamental problems in computer networks in which a router determines the next-hop of each packet in the queue to get it as quickly as possible to its destination. Reinforcement learning (RL) has been introduced to design autonomous packet routing policies with local information of stochastic packet arrival and service. However, the curse of dimensionality of RL prohibits the more comprehensive representation of dynamic network states, thus limiting its potential benefit. In this paper, we propose a novel packet routing framework based on \emph{multi-agent} deep reinforcement learning (DRL) in which each router possess an \emph{independent} LSTM recurrent neural network for training and decision making in a \emph{fully distributed} environment. The LSTM recurrent neural network extracts routing features from rich information regarding backlogged packets and past actions, and effectively approximates the value function of Q-learning. We further allow each route to communicate periodically with direct neighbors so that a broader view of network state can be incorporated. Experimental results manifest that our multi-agent DRL policy can strike the delicate balance between congestion-aware and shortest routes, and significantly reduce the packet delivery time in general network topologies compared with its counterparts.

CRNov 1, 2018
A Deep Dive into Blockchain Selfish Mining

Qianlan Bai, Xinyan Zhou, Xing Wang et al.

This paper studies a fundamental problem regarding the security of blockchain on how the existence of multiple misbehaving pools influences the profitability of selfish mining. Each selfish miner maintains a private chain and makes it public opportunistically for the purpose of acquiring more rewards incommensurate to his Hashrate. We establish a novel Markov chain model to characterize all the state transitions of public and private chains. The minimum requirement of Hashrate together with the minimum delay of being profitable is derived in close-form. The former reduces to 21.48% with the symmetric selfish miners, while their competition with asymmetric Hashrates puts forward a higher requirement of the profitable threshold. The profitable delay increases with the decrease of the Hashrate of selfish miners, making the mining pools more cautious on performing selfish mining.

DCNov 20, 2017
Deep Reinforcement Learning for Multi-Resource Multi-Machine Job Scheduling

Weijia Chen, Yuedong Xu, Xiaofeng Wu

Minimizing job scheduling time is a fundamental issue in data center networks that has been extensively studied in recent years. The incoming jobs require different CPU and memory units, and span different number of time slots. The traditional solution is to design efficient heuristic algorithms with performance guarantee under certain assumptions. In this paper, we improve a recently proposed job scheduling algorithm using deep reinforcement learning and extend it to multiple server clusters. Our study reveals that deep reinforcement learning method has the potential to outperform traditional resource allocation algorithms in a variety of complicated environments.