84.2 | SY | Apr 28
Distributed adaptive estimation for stochastic large regression models
Die Gan, Siyu Xie, Zhixin Liu et al.
This paper studies the distributed adaptive estimation problems for stochastic large regression models with an infinite number of parameters. By constructing a recursive local cost function, we propose a novel distributed recursive least squares algorithm to estimate the unknown system parameters, where the growth rate of regressors' dimension is characterized by a non-decreasing positive function. The almost sure convergence of the proposed algorithm is established under a cooperative excitation condition, which incorporates the temporal information and the spatial information to reflect the cooperative effect among multiple agents. Moreover, we analyze the prediction error by establishing the asymptotic upper bound of the accumulated regret without any excitation conditions. The main difficulty of theoretical analysis lies in how to analyze properties of the product of non-independent and non-stationary random matrices, whose dimensions change over time simultaneously. Some techniques, such as stochastic Lyapunov function, double-array martingale theory and algebraic graph theory, are employed to deal with the above issue. Our theoretical results are derived without imposing independence or stationarity assumptions on the regression vectors, thereby not excluding the correlated feedback signals.
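As a point of reference for the recursive least squares idea mentioned in this abstract, the following is a minimal sketch of the textbook single-agent recursion with a fixed regressor dimension. It is not the distributed algorithm of the paper, which additionally combines information across agents and lets the regressor dimension grow over time. The names `rls_step` and `fit_rls` and the initial covariance scale `p0` are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def rls_step(theta: Vector, P: Matrix, phi: Sequence[float], y: float) -> Tuple[Vector, Matrix]:
    """One textbook RLS update for the model y = phi^T theta + noise.

    P is assumed symmetric, so phi^T P equals (P phi)^T.
    """
    n = len(theta)
    # P @ phi
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    # Scalar normaliser 1 + phi^T P phi
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    # Prediction error using the previous estimate
    error = y - sum(phi[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + P_phi[i] * error / denom for i in range(n)]
    # Rank-one downdate of P (Sherman-Morrison form)
    P_new = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(n)] for i in range(n)]
    return theta_new, P_new


def fit_rls(data: Sequence[Tuple[Sequence[float], float]], dim: int, p0: float = 1e4) -> Vector:
    """Run RLS over (phi, y) pairs, starting from theta = 0 and P = p0 * I (assumed values)."""
    theta = [0.0] * dim
    P = [[p0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for phi, y in data:
        theta, P = rls_step(theta, P, phi, y)
    return theta
```

With noise-free data generated from a fixed parameter vector and sufficiently varied regressors, the estimate returned by `fit_rls` should approach that vector up to a small bias induced by the finite `p0`.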
AI | Feb 16, 2025 | Code
Hierarchical Expert Prompt for Large-Language-Model: An Approach Defeat Elite AI in TextStarCraft II for the First Time
Zongyuan Li, Chang Lu, Xiaojie Xu et al.
Since their emergence, Large Language Models (LLMs) have been widely used in fields such as writing, translation, and search. However, there is still great potential for LLM-based methods in handling complex tasks such as decision-making in the StarCraft II environment. To address problems such as lack of relevant knowledge and poor control over subtasks of varying importance, we propose a Hierarchical Expert Prompt (HEP) for LLMs. Our method improves the understanding of game situations through expert-level tactical knowledge and improves the handling of tasks of varying importance through a hierarchical framework. Our approach defeated the highest level (Elite) standard built-in agent in TextStarCraft II for the first time and consistently outperformed the baseline method at other difficulty levels. Our experiments suggest that the proposed method is a practical solution for tackling complex decision-making challenges. The replay video can be viewed at https://www.bilibili.com/video/BV1uz42187EF and https://youtu.be/dO3PshWLV5M, and our code is open-sourced at https://github.com/luchang1113/HEP-LLM-play-StarCraftII.
RO | Nov 1, 2020 | Code
MRPB 1.0: A Unified Benchmark for the Evaluation of Mobile Robot Local Planning Approaches
Jian Wen, Xuebo Zhang, Qingchen Bi et al.
Local planning is one of the key technologies for mobile robots to achieve full autonomy and has been widely investigated. To evaluate mobile robot local planning approaches in a unified and comprehensive way, a mobile robot local planning benchmark called MRPB 1.0 is newly proposed in this paper. The benchmark facilitates both motion planning researchers who want to compare the performance of a new local planner relative to many other state-of-the-art approaches as well as end users in the mobile robotics industry who want to select a local planner that performs best on some problems of interest. We elaborately design various simulation scenarios to challenge the applicability of local planners, including large-scale, partially unknown, and dynamic complex environments. Furthermore, three types of principled evaluation metrics are carefully designed to quantitatively evaluate the performance of local planners, wherein the safety, efficiency, and smoothness of motions are comprehensively considered. We present the application of the proposed benchmark in two popular open-source local planners to show the practicality of the benchmark. In addition, some insights and guidelines about the design and selection of local planners are also provided. The benchmark website contains all data of the designed simulation scenarios, detailed descriptions of these scenarios, and example code.
AI | Nov 8, 2024
LLM-PySC2: Starcraft II learning environment for Large Language Models
Zongyuan Li, Yanan Ni, Runnan Qi et al.
Large language models (LLMs) have demonstrated tremendous potential in intelligent decision-making problems, with unprecedented capabilities shown across diverse applications ranging from gaming AI systems to complex strategic planning frameworks. However, the StarCraft II platform, which has been widely adopted for validating decision-making algorithms in the past decade, has not yet provided substantial support for this emerging domain. To address the inability of LLMs to interface with the hundreds of actions of the pysc2 backend and the lack of native support for multi-agent (MA) collaboration, we propose the LLM-PySC2 environment. This is the first environment that offers LLMs the complete pysc2 action space with sufficient multi-modal information and game Wiki knowledge. With an asynchronous query architecture, the environment interacts efficiently with LLMs and maintains a constant latency regardless of the size of the agent population. In the experiments, we evaluated LLMs' decision-making performance in both macro-decision and micro-operation scenarios, using traditional StarCraft II Multi-Agent Challenge (SMAC) tasks and a series of newly proposed tasks. Results indicate that LLMs have the potential to achieve victories in complex scenarios but cannot consistently generate correct decisions, especially in the recovered pysc2 action space and MA settings. Without task-relevant instructions, the pre-trained models suffer from issues such as hallucinations and inefficient collaboration. Our findings suggest that StarCraft II remains challenging in the era of large models and that much work is still needed to develop an advanced LLM decision-making system. The proposed LLM-PySC2 environment will support future development of LLM-based decision-making solutions.
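To illustrate why an asynchronous query architecture can keep latency roughly constant as the number of agents grows, here is a minimal sketch using Python's `asyncio`. It does not use the LLM-PySC2 API: `fake_llm_query`, `query_all`, and `timed_round` are hypothetical names, and the LLM call is simulated with a fixed sleep.

```python
import asyncio
import time
from typing import List, Tuple


async def fake_llm_query(agent_id: int, latency: float) -> str:
    """Stand-in for a remote LLM call (hypothetical); just waits, then returns a dummy action."""
    await asyncio.sleep(latency)
    return f"agent-{agent_id}: no-op"


async def query_all(num_agents: int, latency: float) -> List[str]:
    """Issue all agent queries concurrently and collect results in agent order."""
    tasks = [fake_llm_query(i, latency) for i in range(num_agents)]
    return list(await asyncio.gather(*tasks))


def timed_round(num_agents: int, latency: float = 0.05) -> Tuple[List[str], float]:
    """Run one decision round and report the wall-clock time it took."""
    start = time.perf_counter()
    actions = asyncio.run(query_all(num_agents, latency))
    return actions, time.perf_counter() - start
```

Because the simulated queries overlap, a round with eight agents should take close to one query's latency instead of eight times that, which is the behaviour the abstract attributes to its asynchronous design.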
AI | May 2, 2025
Retrieval Augmented Learning: A Retrial-based Large Language Model Self-Supervised Learning and Autonomous Knowledge Generation
Zongyuan Li, Pengfei Li, Runnan Qi et al.
The lack of domain-specific data in the pre-training of Large Language Models (LLMs) severely limits LLM-based decision systems in specialized applications, while post-training a model for such scenarios requires significant computational resources. In this paper, we present Retrial-Augmented Learning (RAL), a reward-free self-supervised learning framework for LLMs that operates without model training. By developing Retrieval-Augmented Generation (RAG) into a module for organizing intermediate data, we realize a three-stage autonomous knowledge generation process: proposing a hypothesis, validating the hypothesis, and generating the knowledge. The method is evaluated in the LLM-PySC2 environment, a representative decision-making platform that combines sufficient complexity with domain-specific knowledge requirements. Experiments demonstrate that the proposed method effectively reduces hallucination by generating and utilizing validated knowledge, and increases decision-making performance at an extremely low cost. Meanwhile, the approach exhibits potential in out-of-distribution (OOD) tasks, robustness, and transferability, making it a cost-friendly but effective solution for decision-making problems and autonomous knowledge generation.
AI | Feb 19, 2025
Reflection of Episodes: Learning to Play Game from Expert and Self Experiences
Xiaojie Xu, Zongyuan Li, Chang Lu et al.
StarCraft II is a complex and dynamic real-time strategy (RTS) game environment, which is very suitable for artificial intelligence and reinforcement learning research. To address the problem of Large Language Model (LLM) learning in complex environments through self-reflection, we propose a Reflection of Episodes (ROE) framework based on expert experience and self-experience. This framework first obtains key information in the game through a keyframe selection method, then makes decisions based on expert experience and self-experience. After a game is completed, it reflects on the previous experience to obtain new self-experience. Finally, in the experiment, our method beat the robot under the Very Hard difficulty in TextStarCraft II. We analyze the LLM's data over the course of the game in detail and verify the method's effectiveness.
OC | Jun 17, 2024
Two-Timescale Optimization Framework for Sparse-Feedback Linear-Quadratic Optimal Control
Lechen Feng, Yuan-Hua Ni, Xuebo Zhang
An $\mathcal{H}_2$-guaranteed sparse-feedback linear-quadratic (LQ) optimal control problem with convex parameterization and convex-bounded uncertainty is studied in this paper, where an $\ell_0$-penalty is added to the $\mathcal{H}_2$ cost to penalize the number of communication links among distributed controllers. The sparse-feedback gain is then investigated to minimize the modified $\mathcal{H}_2$ cost together with the stability guarantee, and the main results come in three parts. First, the $\ell_1$ relaxation of the sparse-feedback LQ problem is considered, and a two-timescale algorithm is developed based on proximal coordinate descent and a primal-dual splitting approach. Second, a piecewise quadratic relaxation of sparse-feedback LQ control is investigated, which exhibits an accelerated convergence rate. Third, the sparse-feedback LQ problem with $\ell_0$-penalty is directly studied through the BSUM (Block Successive Upper-bound Minimization) framework, and a precise approximation method and variational properties are introduced.
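The $\ell_1$ relaxation mentioned above lends itself to proximal methods because the proximal operator of the $\ell_1$ norm has a closed form, namely elementwise soft-thresholding. The sketch below shows only that standard building block, not the paper's two-timescale algorithm; the gain matrix `K`, the threshold `tau`, and the function names are illustrative assumptions.

```python
from typing import List


def soft_threshold(x: float, tau: float) -> float:
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def prox_l1(K: List[List[float]], tau: float) -> List[List[float]]:
    """Apply soft-thresholding to every entry of a (hypothetical) feedback gain matrix K."""
    return [[soft_threshold(k, tau) for k in row] for row in K]
```

Entries of `K` whose magnitude is at most `tau` become exactly zero, which is how an $\ell_1$ penalty can remove entries of a feedback gain, interpreted in the abstract as communication links among controllers.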
RO | Jan 31, 2022
G$ \mathbf{^2} $VD Planner: Efficient Motion Planning With Grid-based Generalized Voronoi Diagrams
Jian Wen, Xuebo Zhang, Qingchen Bi et al.
In this paper, an efficient motion planning approach with grid-based generalized Voronoi diagrams (G$ \mathbf{^2} $VD) is newly proposed for mobile robots. Different from existing approaches, the novelty of this work is twofold: 1) a new state lattice-based path searching approach is proposed, in which the search space is reduced to a novel Voronoi corridor to further improve the search efficiency; 2) an efficient quadratic programming-based path smoothing approach is presented, wherein the clearance to obstacles is considered to improve the path clearance of hard-constrained path smoothing approaches. We validate the efficiency and smoothness of our approach in various challenging simulation scenarios and outdoor environments. It is shown that the computational efficiency is improved by 17.1% in the path searching stage, and path smoothing with the proposed approach is 6.6 times faster than an advanced sparse-banded structure-based path smoothing approach and 53.3 times faster than the popular timed-elastic-band planner. A video showing outdoor navigation on our campus is available at https://youtu.be/iMXGthgvp58.
RO | Dec 16, 2020
E$ \mathbf{^3} $MoP: Efficient Motion Planning Based on Heuristic-Guided Motion Primitives Pruning and Path Optimization With Sparse-Banded Structure
Jian Wen, Xuebo Zhang, Haiming Gao et al.
To solve the autonomous navigation problem in complex environments, an efficient motion planning approach is newly presented in this paper. Considering the challenges from large-scale, partially unknown complex environments, a three-layer motion planning framework is elaborately designed, including global path planning, local path optimization, and time-optimal velocity planning. Compared with existing approaches, the novelty of this work is twofold: 1) a novel heuristic-guided pruning strategy of motion primitives is proposed and fully integrated into the state lattice-based global path planner to further improve the computational efficiency of graph search, and 2) a new soft-constrained local path optimization approach is proposed, wherein the sparse-banded system structure of the underlying optimization problem is fully exploited to efficiently solve the problem. We validate the safety, smoothness, flexibility, and efficiency of our approach in various complex simulation scenarios and challenging real-world tasks. It is shown that the computational efficiency is improved by 66.21% in the global planning stage and the motion efficiency of the robot is improved by 22.87% compared with the recent quintic Bézier curve-based state space sampling approach. We name the proposed motion planning framework E$ \mathrm{^3} $MoP, where the number 3 not only means our approach is a three-layer framework but also means the proposed approach is efficient in three stages.
RO | Jan 7, 2019
CAE-RLSM: Consistent and Efficient Redundant Line Segment Merging for Online Feature Map Building
Jian Wen, Xuebo Zhang, Haiming Gao et al.
In order to obtain a compact line segment-based map representation for localization and planning of mobile robots, it is necessary to merge redundant line segments which physically represent the same part of the environment in different scans. In this paper, a consistent and efficient redundant line segment merging approach (CAE-RLSM) is proposed for online feature map building. The proposed CAE-RLSM is composed of two newly proposed modules: one-to-many incremental line segment merging (OTM-ILSM) and multi-processing global map adjustment (MP-GMA). Different from state-of-the-art offline merging approaches, the proposed CAE-RLSM can achieve real-time mapping performance, which not only reduces the redundancy of incremental merging with high efficiency, but also solves the problem of global map adjustment after loop closing to guarantee global consistency. Furthermore, a new correlation-based evaluation metric is proposed for the quality evaluation of line segment maps. This evaluation metric does not require manual measurement of the environmental metric information; instead, it makes full use of globally consistent laser scans obtained by simultaneous localization and mapping (SLAM) systems to compare the performance of different line segment-based mapping approaches in an objective and fair manner. Comparative experimental results with respect to a mean shift-based offline redundant line segment merging approach (MS-RLSM) and an offline version of a one-to-one incremental line segment merging approach (O$^2$TO-ILSM) on both public data sets and a self-recorded data set are presented to show the superior performance of CAE-RLSM in terms of efficiency and map quality in different scenarios.
RO | Dec 8, 2018
Real-time Acceleration-continuous Path-constrained Trajectory Planning With Built-in Tradability Between Cruise and Time-optimal Motions
Peiyao Shen, Xuebo Zhang, Yongchun Fang
In this paper, a novel real-time acceleration-continuous path-constrained trajectory planning algorithm is proposed with an appealing built-in tradability mechanism between cruise motion and time-optimal motion. Different from existing approaches, the proposed approach smooths time-optimal trajectories with bang-bang input structures to generate acceleration-continuous trajectories while preserving the completeness property. More importantly, a novel built-in tradability mechanism is proposed and embedded into the trajectory planning framework, so that the proportion of the cruise motion and time-optimal motion can be flexibly adjusted by changing a user-specified functional parameter. Thus, the user can easily apply the trajectory planning algorithm for various tasks with different requirements on motion efficiency and cruise proportion. Moreover, it is shown that feasible trajectories are computed more quickly than optimal trajectories. Rigorous mathematical analysis and proofs are provided for these results. Comparative simulation and experimental results on omnidirectional wheeled mobile robots demonstrate the capability of the proposed algorithm in terms of flexible tuning between cruise and time-optimal motions, as well as higher computational efficiency.
RO | Oct 10, 2016
Essential Properties of Numerical Integration for Time-optimal Trajectory Planning Along a Specified Path
Peiyao Shen, Xuebo Zhang, Yongchun Fang
This letter summarizes some known properties and also presents several new properties of the Numerical Integration (NI) method for time-optimal trajectory planning along a specified path. The contribution is that rigorous mathematical proofs of these properties are presented, most of which cannot be found in the existing literature. We first give some properties regarding switch points and accelerating/decelerating curves of the NI method. Then, because the original version of NI considers only torque constraints and may therefore fail to plan a trajectory when kinematic constraints are also considered, we give the concrete failure conditions with rigorous mathematical proof. Accordingly, a failure detection algorithm is given in a run-and-test manner. Some simulation results on a unicycle vehicle are provided to verify the presented properties. Note that although the known properties are not first discovered here, their mathematical proofs are given for the first time in this letter. The detailed proofs make the theory of NI more complete and help interested readers to gain a thorough understanding of the method.
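For readers unfamiliar with integrating accelerating and decelerating curves along a path, the following is a heavily simplified sketch: a forward pass limits speed by maximum acceleration, a backward pass limits it by maximum deceleration, and both respect a pointwise speed limit. It assumes a constant acceleration bound `a_max` and uniformly spaced path stations `ds`, and it is not the NI method analysed in the letter, which handles torque constraints and switch points. The function name `velocity_profile` is an illustrative assumption.

```python
import math
from typing import List


def velocity_profile(v_limit: List[float], ds: float, a_max: float) -> List[float]:
    """Speed at each path station under a pointwise limit and |acceleration| <= a_max.

    v_limit[i] is the maximum allowed speed at station i; set the first and
    last entries to the desired boundary speeds (for example 0.0 for rest).
    """
    v = list(v_limit)
    # Forward pass: accelerate as hard as allowed, using v_next^2 = v^2 + 2 * a * ds.
    for i in range(1, len(v)):
        v[i] = min(v[i], math.sqrt(v[i - 1] ** 2 + 2.0 * a_max * ds))
    # Backward pass: make sure the profile can still brake in time.
    for i in range(len(v) - 2, -1, -1):
        v[i] = min(v[i], math.sqrt(v[i + 1] ** 2 + 2.0 * a_max * ds))
    return v
```

For a rest-to-rest motion with a loose speed limit, the resulting profile accelerates to a peak near the middle of the path and then decelerates symmetrically, mirroring the bang-bang structure of time-optimal motion.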