Yibin Xu

CL
h-index7
10papers
49citations
Novelty54%
AI Score40

10 Papers

CLMar 14, 2025Code
DeskVision: Large Scale Desktop Region Captioning for Advanced GUI Agents

Yibin Xu, Liang Yang, Hao Chen et al.

The limitation of graphical user interface (GUI) data has been a significant barrier to the development of GUI agents today, especially for the desktop / computer use scenarios. To address this, we propose an automated GUI data generation pipeline, AutoCaptioner, which generates data with rich descriptions while minimizing human effort. Using AutoCaptioner, we created a novel large-scale desktop GUI dataset, DeskVision, along with the largest desktop test benchmark, DeskVision-Eval, which reflects daily usage and covers diverse systems and UI elements, each with rich descriptions. With DeskVision, we train a new GUI understanding model, GUIExplorer. Results show that GUIExplorer achieves state-of-the-art (SOTA) performance in understanding/grounding visual elements without the need for complex architectural designs. We further validated the effectiveness of the DeskVision dataset through ablation studies on various large visual language models (LVLMs). We believe that AutoCaptioner and DeskVision will significantly advance the development of GUI agents, and will open-source them for the community.

CLSep 20, 2025Code
LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts

Junhao Chen, Jingbo Sun, Xiang Li et al.

As large language models (LLMs) advance across diverse tasks, the need for comprehensive evaluation beyond single metrics becomes increasingly important. To fully assess LLM intelligence, it is crucial to examine their interactive dynamics and strategic behaviors. We present LLMsPark, a game theory-based evaluation platform that measures LLMs' decision-making strategies and social behaviors in classic game-theoretic settings, providing a multi-agent environment to explore strategic depth. Our system cross-evaluates 15 leading LLMs (both commercial and open-source) using leaderboard rankings and scoring mechanisms. Higher scores reflect stronger reasoning and strategic capabilities, revealing distinct behavioral patterns and performance differences across models. This work introduces a novel perspective for evaluating LLMs' strategic intelligence, enriching existing benchmarks and broadening their assessment in interactive, game-theoretic scenarios. The benchmark and rankings are publicly available at https://llmsparks.github.io/.

LGApr 18, 2024
Privacy-Preserving UCB Decision Process Verification via zk-SNARKs

Xikun Jiang, He Lyu, Chenhao Ying et al.

With the increasingly widespread application of machine learning, how to strike a balance between protecting the privacy of data and algorithm parameters and ensuring the verifiability of machine learning has always been a challenge. This study explores the intersection of reinforcement learning and data privacy, specifically addressing the Multi-Armed Bandit (MAB) problem with the Upper Confidence Bound (UCB) algorithm. We introduce zkUCB, an innovative algorithm that employs the Zero-Knowledge Succinct Non-Interactive Argument of Knowledge (zk-SNARKs) to enhance UCB. zkUCB is carefully designed to safeguard the confidentiality of training data and algorithmic parameters, ensuring transparent UCB decision-making. Experiments highlight zkUCB's superior performance, attributing its enhanced reward to judicious quantization bit usage that reduces information entropy in the decision-making process. zkUCB's proof size and verification time scale linearly with the execution steps of zkUCB. This showcases zkUCB's adept balance between data security and operational efficiency. This approach contributes significantly to the ongoing discourse on reinforcing data privacy in complex decision-making processes, offering a promising solution for privacy-sensitive applications.

CLJun 1, 2021
THG: Transformer with Hyperbolic Geometry

Zhe Liu, Yibin Xu

Transformer model architectures have become an indispensable staple in deep learning lately for their effectiveness across a range of tasks. Recently, a surge of "X-former" models have been proposed which improve upon the original Transformer architecture. However, most of these variants make changes only around the quadratic time and memory complexity of self-attention, i.e. the dot product between the query and the key. What's more, they are calculate solely in Euclidean space. In this work, we propose a novel Transformer with Hyperbolic Geometry (THG) model, which take the advantage of both Euclidean space and Hyperbolic space. THG makes improvements in linear transformations of self-attention, which are applied on the input sequence to get the query and the key, with the proposed hyperbolic linear. Extensive experiments on sequence labeling task, machine reading comprehension task and classification task demonstrate the effectiveness and generalizability of our model. It also demonstrates THG could alleviate overfitting.

DCApr 9, 2020
A $p/2$ Adversary Power Resistant Blockchain Sharding Approach

Yibin Xu, Jianhua Shao, Yangyu Huang et al.

Blockchain Sharding is a blockchain performance enhancement approach. By splitting a blockchain into several parallel-run committees (shards), it helps increase transaction throughput, reduce computational resources required, and increase reward expectation for participants. Recently, several flexible sharding methods that can tolerate up to $n/2$ Byzantine nodes ($n/2$ security level) have been proposed. However, these methods suffer from three main drawbacks. First, in a non-sharding blockchain, nodes can have different weight (power or stake) to create a consensus, and as such an adversary needs to control half of the overall weight in order to manipulate the system ($p/2$ security level). In blockchain sharding, all nodes carry the same weight. Thus, it is only under the assumption that honest participants create as many nodes as they should that a $n/2$ security level blockchain sharding reaches the $p/2$ security level. Second, when some nodes leave the system, other nodes need to be reassigned, frequently, from shard to shard in order to maintain the security level. This has an adverse effect on system performance. Third, while some $n/2$ approaches can maintain data integrity with up to $n/2$ Byzantine nodes, their systems can halt with a smaller number of Byzantine nodes. In this paper, we present a $p/2$ security level blockchain sharding approach that does not require honest participants to create multiple nodes, requires less node reassignment when some nodes leave the system, and can prevent the system from halting. Our experiments show that our new approach outperforms existing blockchain sharding approaches in terms of security, transaction throughput and flexibility.

DCMar 16, 2020
A Flexible n/2 Adversary Node Resistant and Halting Recoverable Blockchain Sharding Protocol

Yibin Xu, Yangyu Huang, Jianhua Shao et al.

Blockchain sharding is a promising approach to solving the dilemma between decentralisation and high performance (transaction throughput) for blockchain. The main challenge of Blockchain sharding systems is how to reach a decision on a statement among a sub-group (shard) of people while ensuring the whole population recognises this statement. Namely, the challenge is to prevent an adversary who does not have the majority of nodes globally but have the majority of nodes inside a shard. Most Blockchain sharding approaches can only reach a correct consensus inside a shard with at most $n/3$ evil nodes in a $n$ node system. There is a blockchain sharding approach which can prevent an incorrect decision to be reached when the adversary does not have $n/2$ nodes globally. However, the system can be stopped from reaching consensus (become deadlocked) if the adversary controls a smaller number of nodes. In this paper, we present an improved Blockchain sharding approach that can withstand $n/2$ adversarial nodes and recover from deadlocks. The recovery is made by dynamically adjusting the number of shards and the shard size. A performance analysis suggests our approach has a high performance (transaction throughput) while requiring little bandwidth for synchronisation.

CRJan 20, 2020
Segment blockchain: A size reduced storage mechanism for blockchain

Yibin Xu, Yangyu Huang

The exponential growth of the blockchain size has become a major contributing factor that hinders the decentralisation of blockchain and its potential implementations in data-heavy applications. In this paper, we propose segment blockchain, an approach that segmentises blockchain and enables nodes to only store a copy of one blockchain segment. We use \emph{PoW} as a membership threshold to limit the number of nodes taken by an Adversary---the Adversary can only gain at most $n/2$ of nodes in a network of $n$ nodes when it has $50\%$ of the calculation power in the system (the Nakamoto blockchain security threshold). A segment blockchain system fails when an Adversary stores all copies of a segment, because the Adversary can then leave the system, causing a permanent loss of the segment. We theoretically prove that segment blockchain can sustain a $(AD/n)^m$ failure probability when the Adversary has no more than $AD$ number of nodes and every segment is stored by $m$ number of nodes. The storage requirement is mostly shrunken compared to the traditional design and therefore making the blockchain more suitable for data-heavy applications.

DCJan 20, 2020
Contract-connection:An efficient communication protocol for Distributed Ledger Technology

Yibin Xu, Yangyu Huang

Distributed Ledger Technology (DLT) is promising to become the foundation of many decentralised systems. However, the unbalanced and unregulated network layout contributes to the inefficiency of DLT especially in the Internet of Things (IoT) environments, where nodes connect to only a limited number of peers. The data communication speed globally is unbalanced and does not live up to the constraints of efficient real-time distributed systems. In this paper, we introduce a new communication protocol, which enables nodes to calculate the tradeoff between connecting/disconnecting a peer in a completely decentralised manner. The network layout globally is continuously re-balancing and optimising along with nodes adjusting their peers. This communication protocol weakened the inequality of the communication network. The experiment suggests this communication protocol is stable and efficient.

CRJan 19, 2020
Anchoring the value of Cryptocurrency

Yibin Xu, Yangyu Huang, Jianhua Shao

A decade long thrive of cryptocurrency has shown its potential as a source of alternative-finance and the security and the robustness of the underpinning blockchain technology. However, most cryptocurrencies fail to show inimitability and their meanings in the real world. As a result, they usually start off as favourites but quickly become the outcasts of the digital asset market. The blockchain society attempts to anchor the value of cryptocurrency with real values by employing smart contracts and link it with computation resources and the digital-productivity that have value and demands in the real world. But their attempts have some undesirable effects due to a limited number of practical applications. This limitation is caused by the dilemma between high performance and decentralisation (universal joinability). The emerging of blockchain sharding models, however, has offered a possible solution to address this dilemma. In this paper, we explore a financial model for blockchain sharding that will build an active link between the value of cryptocurrency and computation resources as well as the market and labour behaviours. Our model can adjust the price of resources and the compensation for maintaining a system based on those behaviours. We anchor the value of cryptocurrency by the amount of computation resources participated in and give the cryptocurrency a meaning as the exchange between computation resources globally. Finally, we present a working example which, through financial regularities, regulates the behaviour of anonymous participants, also incents/discourages participation dynamically.

CRJan 15, 2020
An n/2 Byzantine node tolerate Blockchain Sharding approach

Yibin Xu, Yangyu Huang

Traditional Blockchain Sharding approaches can only tolerate up to n/3 of nodes being adversary because they rely on the hypergeometric distribution to make a failure (an adversary does not have n/3 of nodes globally but can manipulate the consensus of a Shard) hard to happen. The system must maintain a large Shard size (the number of nodes inside a Shard) to sustain the low failure probability so that only a small number of Shards may exist. In this paper, we present a new approach of Blockchain Sharding that can withstand up to n/2 of nodes being bad. We categorise the nodes into different classes, and every Shard has a fixed number of nodes from different classes. We prove that this design is much more secure than the traditional models (only have one class) and the Shard size can be reduced significantly. In this way, many more Shards can exist, and the transaction throughput can be largely increased. The improved Blockchain Sharding approach is promising to serve as the foundation for decentralised autonomous organisations and decentralised database.