SYMar 12, 2012
Average Consensus on General Strongly Connected DigraphsKai Cai, Hideaki Ishii
We study the average consensus problem of multi-agent systems for general network topologies with unidirectional information flow. We propose two (linear) distributed algorithms, deterministic and gossip, respectively for the cases where the inter-agent communication is synchronous and asynchronous. Our contribution is that in both cases, the developed algorithms guarantee state averaging on arbitrary strongly connected digraphs; in particular, this graphical condition does not require that the network be balanced or symmetric, thereby extending many previous results in the literature. The key novelty of our approach is to augment an additional variable for each agent, called "surplus", whose function is to locally record individual state updates. For convergence analysis, we employ graph-theoretic and nonnegative matrix tools, with the eigenvalue perturbation theory playing a crucial role.
SYDec 31, 2018
On Scalable Supervisory Control of Multi-Agent Discrete-Event SystemsYingying Liu, Kai Cai, Zhiwu Li
In this paper we study multi-agent discrete-event systems where the agents can be divided into several groups, and within each group the agents have similar or identical state transition structures. We employ a relabeling map to generate a "template structure" for each group, and synthesize a scalable supervisor whose state size and computational process are independent of the number of agents. This scalability allows the supervisor to remain invariant (no recomputation or reconfiguration needed) if and when there are agents removed due to failure or added for increasing productivity. The constant computational effort for synthesizing the scalable supervisor also makes our method promising for handling large-scale multi-agent systems. Moreover, based on the scalable supervisor we design scalable local controllers, one for each component agent, to establish a purely distributed control architecture. Three examples are provided to illustrate our proposed scalable supervisory synthesis and the resulting scalable supervisors as well as local controllers.
SYSep 1, 2014
Distributed Supervisory Control of Discrete-Event Systems with Communication DelayRenyuan Zhang, Kai Cai, Yongmei Gan et al.
This paper identifies a property of delay-robustness in distributed supervisory control of discrete-event systems (DES) with communication delays. In previous work a distributed supervisory control problem has been investigated on the assumption that inter-agent communications take place with negligible delay. From an applications viewpoint it is desirable to relax this constraint and identify communicating distributed controllers which are delay-robust, namely logically equivalent to their delay-free counterparts. For this we introduce inter-agent channels modeled as 2-state automata, compute the overall system behavior, and present an effective computational test for delay-robustness. From the test it typically results that the given delay-free distributed control is delay-robust with respect to certain communicated events, but not for all, thus distinguishing events which are not delay-critical from those that are. The approach is illustrated by a workcell model with three communicating agents.
SYJun 19, 2018
Supervisor Localization of Discrete-Event Systems under Partial ObservationRenyuan Zhang, Kai Cai, W. M. Wonham
Recently we developed supervisor localization, a top-down approach to distributed control of discrete-event systems. Its essence is the allocation of monolithic (global) control action among the local control strategies of individual agents. In this paper, we extend supervisor localization by considering partial observation; namely not all events are observable. Specifically, we employ the recently proposed concept of relative observability to compute a partial-observation monolithic supervisor, and then design a suitable localization procedure to decompose the supervisor into a set of local controllers. In the resulting local controllers, only observable events can cause state change. Further, to deal with large-scale systems, we combine the partial-observation supervisor localization with an efficient architectural synthesis approach: first compute a heterarchical array of partial-observation decentralized supervisors and coordinators, and then localize each of these supervisors/coordinators into local controllers.
SYMay 9, 2011
Convergence Time Analysis of Quantized Gossip Consensus on DigraphsKai Cai, Hideaki Ishii
We have recently proposed quantized gossip algorithms which solve the consensus and averaging problems on directed graphs with the least restrictive connectivity requirements. In this paper we study the convergence time of these algorithms. To this end, we investigate the shrinking time of the smallest interval that contains all states for the consensus algorithm, and the decay time of a suitable Lyapunov function for the averaging algorithm. The investigation leads us to characterizing the convergence time by the hitting time in certain special Markov chains. We simplify the structures of state transition by considering the special case of complete graphs, where every edge can be activated with an equal probability, and derive polynomial upper bounds on convergence time.
SYApr 13, 2016
Relative Coobservability in Decentralized Supervisory Control of Discrete-Event SystemsKai Cai, Renyuan Zhang, W. M. Wonham
We study the new concept of relative coobservability in decentralized supervisory control of discrete-event systems under partial observation. This extends our previous work on relative observability from a centralized setup to a decentralized one. A fundamental concept in decentralized supervisory control is coobservability (and its several variations); this property is not, however, closed under set union, and hence there generally does not exist the supremal element. Our proposed relative coobservability, although stronger than coobservability, is algebraically well-behaved, and the supremal relatively coobservable sublanguage of a given language exists. We present an algorithm to compute this supremal sublanguage. Moreover, relative coobservability is weaker than conormality, which is also closed under set union; unlike conormality, relative coobservability imposes no constraint on disabling unobservable controllable events.
SYMar 15, 2019
Supervisor Localization of Timed Discrete-Event Systems under Partial Observation and Communication DelayRenyuan Zhang, Kai Cai
We study supervisor localization for timed discrete-event systems under partial observation and communication delay in the Brandin-Wonham framework. First, we employ timed relative observability to synthesize a partial-observation monolithic supervisor; the control actions of this supervisor include not only disabling action of prohibitible events (as that of controllable events in the untimed case) but also "clock-preempting" action of forcible events. Accordingly we decompose the supervisor into a set of partial-observation local controllers one for each prohibitible event, as well as a set of partial-observation local preemptors one for each forcible event. We prove that these local controllers and preemptors collectively achieve the same controlled behavior as the partial-observation monolithic supervisor does. Moreover, we propose channel models for inter-agent event communication and impose bounded and unbounded delay as temporal specifications. In this formulation, there exist multiple distinct observable event sets; thus we employ timed relative coobservability to synthesize partial-observation decentralized supervisors, and then localize these supervisors into local controllers and preemptors. The above results are illustrated by a timed workcell example.
SYOct 24, 2017
Supervisor Localization of Discrete-Event Systems with Infinite BehaviorRenyuan Zhang, Kai Cai
Recently we developed supervisor localization, a top-down approach to distributed control of discrete-event systems (DES) with finite behavior. Its essence is the allocation of monolithic (global) control action among the local control strategies of individual agents. In this report, we extend supervisor localization to study the distributed control of DES with infinite behavior. Specifically, we first employ Thistle and Wonham's supervisory control theory for DES with infinite behavior to compute a safety supervisor (for safety specifications) and a liveness supervisor (for liveness specifications), and then design a suitable localization procedure to decompose the safety supervisor into a set of safety local controllers, one for each controllable event, and decompose the liveness supervisor into a set of liveness local controllers, two for each controllable event. The localization procedure for decomposing the liveness supervisor is novel; in particular, a local controller is responsible for disabling the corresponding controllable event on only part of the states of the liveness supervisor, and consequently, the derived local controller in general has states number no more than that computed by considering the disablement on all the states. Moreover, we prove that the derived local controllers achieve the same controlled behavior with the safety and liveness supervisors. We finally illustrate the result by a Small Factory example.
SYJun 23, 2016
On Distributed Internal Model Principle for Output Regulation over Time-Varying Networks of Linear Heterogeneous AgentsKai Cai
We study a multi-agent output regulation problem, where not all agents have access to the exosystem's dynamics. We propose a distributed controller that solves the problem for linear, heterogeneous, and uncertain agent dynamics as well as time-varying directed networks. The distributed controller consists of two parts: (1) an exosystem generator that creates a local copy of the exosystem dynamics by using consensus protocols, and (2) a dynamic compensator that uses (again) consensus to approach the internal model of the exosystem and thereby achieves perfect output regulation. Our approach leverages methods from internal model based controller synthesis, multi-agent consensus over directed networks, and stability of time-varying linear systems; the derived result is an adaptation of the (centralized) internal model principle to the distributed, networked setting.
SYDec 8, 2019
Distributed Robust Output Regulation of Heterogeneous Uncertain Linear Agents by Adaptive Internal Model PrincipleSatoshi Kawamura, Kai Cai, Masako Kishida
We study a multi-agent output regulation problem, where not all agents have access to the exosystem's dynamics. We propose a fully distributed controller that solves the problem for linear, heterogeneous, and uncertain agent dynamics as well as time-varying directed networks. The distributed controller consists of two parts: (1) an exosystem generator that locally estimates the exosystem dynamics, and (2) a dynamic compensator that, by locally approaching the internal model of the exosystem, achieves perfect output regulation. Moreover, we extend this distributed controller to solve an output synchronization problem where not all agents initially have the same internal model dynamics. Our approach leverages methods from internal model based controller synthesis and multi-agent consensus over time-varying directed networks; the derived result is a generalization of the (centralized) internal model principle to the distributed, networked setting.
SYApr 3
Data-Driven Synthesis of Probabilistic Controlled Invariant Sets for Linear MDPsKazumune Hashimoto, Shunki Kimura, Kazunobu Serizawa et al.
We study data-driven computation of probabilistic controlled invariant sets (PCIS) for safety-critical reinforcement learning under unknown dynamics. Assuming a linear MDP model, we use regularized least squares and self-normalized confidence bounds to construct a conservative estimate of the states from which the system can be kept inside a prescribed safe region over an \(N\)-step horizon, together with the corresponding set-valued safe action map. This construction is obtained through a backward recursion and can be interpreted as a conservative approximation of the \(N\)-step safety predecessor operator. When the associated conservative-inclusion event holds, a conservative fixed point of the approximate recursion can be certified as an \((N,ε)\)-PCIS with confidence at least \(η\). For continuous state spaces, we introduce a lattice abstraction and a Lipschitz-based discretization error bound to obtain a tractable approximation scheme. Finally, we use the resulting conservative fixed-point approximation as a runtime candidate PCIS in a practical shielding architecture with iterative updates, and illustrate the approach on a numerical experiment.
SEJan 23
SWE-Pruner: Self-Adaptive Context Pruning for Coding AgentsYuhang Wang, Yuling Shi, Mo Yang et al.
LLM agents have demonstrated remarkable capabilities in software development, but their performance is hampered by long interaction contexts, which incur high API costs and latency. While various context compression approaches such as LongLLMLingua have emerged to tackle this challenge, they typically rely on fixed metrics such as PPL, ignoring the task-specific nature of code understanding. As a result, they frequently disrupt syntactic and logical structure and fail to retain critical implementation details. In this paper, we propose SWE-Pruner, a self-adaptive context pruning framework tailored for coding agents. Drawing inspiration from how human programmers "selectively skim" source code during development and debugging, SWE-Pruner performs task-aware adaptive pruning for long contexts. Given the current task, the agent formulates an explicit goal (e.g., "focus on error handling") as a hint to guide the pruning targets. A lightweight neural skimmer (0.6B parameters) is trained to dynamically select relevant lines from the surrounding context given the goal. Evaluations across four benchmarks and multiple models validate SWE-Pruner's effectiveness in various scenarios, achieving 23-54% token reduction on agent tasks like SWE-Bench Verified while even improving success rates, and up to 14.84x compression on single-turn tasks like LongCodeQA with minimal performance impact.
AIJan 21, 2025
UI-TARS: Pioneering Automated GUI Interaction with Native AgentsYujia Qin, Yining Ye, Junjie Fang et al.
This paper introduces UI-TARS, a native GUI agent model that solely perceives the screenshots as input and performs human-like interactions (e.g., keyboard and mouse operations). Unlike prevailing agent frameworks that depend on heavily wrapped commercial models (e.g., GPT-4o) with expert-crafted prompts and workflows, UI-TARS is an end-to-end model that outperforms these sophisticated frameworks. Experiments demonstrate its superior performance: UI-TARS achieves SOTA performance in 10+ GUI agent benchmarks evaluating perception, grounding, and GUI task execution. Notably, in the OSWorld benchmark, UI-TARS achieves scores of 24.6 with 50 steps and 22.7 with 15 steps, outperforming Claude (22.0 and 14.9 respectively). In AndroidWorld, UI-TARS achieves 46.6, surpassing GPT-4o (34.5). UI-TARS incorporates several key innovations: (1) Enhanced Perception: leveraging a large-scale dataset of GUI screenshots for context-aware understanding of UI elements and precise captioning; (2) Unified Action Modeling, which standardizes actions into a unified space across platforms and achieves precise grounding and interaction through large-scale action traces; (3) System-2 Reasoning, which incorporates deliberate reasoning into multi-step decision making, involving multiple reasoning patterns such as task decomposition, reflection thinking, milestone recognition, etc. (4) Iterative Training with Reflective Online Traces, which addresses the data bottleneck by automatically collecting, filtering, and reflectively refining new interaction traces on hundreds of virtual machines. Through iterative training and reflection tuning, UI-TARS continuously learns from its mistakes and adapts to unforeseen situations with minimal human intervention. We also analyze the evolution path of GUI agents to guide the further development of this domain.
AISep 2, 2025
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement LearningHaoming Wang, Haoyang Zou, Huatong Song et al. · pku
The development of autonomous agents for graphical user interfaces (GUIs) presents major challenges in artificial intelligence. While recent advances in native agent models have shown promise by unifying perception, reasoning, action, and memory through end-to-end learning, open problems remain in data scalability, multi-turn reinforcement learning (RL), the limitations of GUI-only operation, and environment stability. In this technical report, we present UI-TARS-2, a native GUI-centered agent model that addresses these challenges through a systematic training methodology: a data flywheel for scalable data generation, a stabilized multi-turn RL framework, a hybrid GUI environment that integrates file systems and terminals, and a unified sandbox platform for large-scale rollouts. Empirical evaluation demonstrates that UI-TARS-2 achieves significant improvements over its predecessor UI-TARS-1.5. On GUI benchmarks, it reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld, outperforming strong baselines such as Claude and OpenAI agents. In game environments, it attains a mean normalized score of 59.8 across a 15-game suite-roughly 60% of human-level performance-and remains competitive with frontier proprietary models (e.g., OpenAI o3) on LMGame-Bench. Additionally, the model can generalize to long-horizon information-seeking tasks and software engineering benchmarks, highlighting its robustness across diverse agent tasks. Detailed analyses of training dynamics further provide insights into achieving stability and efficiency in large-scale agent RL. These results underscore UI-TARS-2's potential to advance the state of GUI agents and exhibit strong generalization to real-world interactive scenarios.
SYAug 26, 2017
Supervisor Localization for Large-Scale Discrete-Event Systems under Partial ObservationRenyuan Zhang, Kai Cai
Recently we developed partial-observation supervisor localization, a top-down approach to distributed control of discrete-event systems (DES) under partial observation. Its essence is the decomposition of the partial-observation monolithic supervisor into partial-observation local controllers for individual controllable events. In this paper we extend the partial-observation supervisor localization to large-scale DES, for which the monolithic supervisor may be incomputable. Specifically, we first employ an efficient heterarchical supervisor synthesis procedure to compute a heterarchical array of partial-observation decentralized supervisors and partial-observation coordinators. Then we localize each of these supervisors/coordinators into partial-observation local controllers. This procedure suggests a systematic approach to the distributed control of large-scale DES under partial observation. The results are illustrated by a system of automatic guided vehicles (AGV) serving a manufacturing workcell.
SYAug 16, 2017
Top-Down Synthesis of Multi-Agent Formation Control: An Eigenstructure Assignment based ApproachTakatoshi Motoyama, Kai Cai
We propose a top-down approach for formation control of heterogeneous multi-agent systems, based on the method of eigenstructure assignment. Given the problem of achieving scalable formations on the plane, our approach globally computes a state feedback control that assigns desired closed-loop eigenvalues/eigenvectors. We characterize the relation between the eigenvalues/eigenvectors and the resulting inter-agent communication topology, and design special (sparse) topologies such that the synthesized control may be implemented locally by the individual agents. Moreover, we present a hierarchical synthesis procedure that significantly improves computational efficiency. Finally, we extend the proposed approach to achieve rigid formation and circular motion, and illustrate these results by simulation examples.
SYSep 8, 2016
Characterizations and Effective Computation of Supremal Relatively Observable SublanguagesKai Cai, Renyuan Zhang, W. M. Wonham
Recently we proposed relative observability for supervisory control of discrete-event systems under partial observation. Relative observability is closed under set unions and hence there exists the supremal relatively observable sublanguage of a given language. In this paper we present a new characterization of relative observability, based on which an operator on languages is proposed whose largest fixpoint is the supremal relatively observable sublanguage. Iteratively applying this operator yields a monotone sequence of languages; exploiting the linguistic concept of support based on Nerode equivalence, we prove for regular languages that the sequence converges finitely to the supremal relatively observable sublanguage, and the operator is effectively computable. Moreover, for the purpose of control, we propose a second operator that in the regular case computes the supremal relatively observable and controllable sublanguage. The computational effectiveness of the operator is demonstrated on a case study.