SY Feb 7, 2017
Optimal Tracking Performance of Control Systems with Two-Channel Constraints
Chao-Yang Chen, Bin Hu, Zhi-Hong Guan et al.
This paper focuses on the tracking performance limitation for a class of networked control systems (NCSs) with two-channel constraints. In the communication channels, bandwidth constraints, energy constraints, and additive colored Gaussian noise (ACGN) are considered simultaneously. In the plant, non-minimum phase zeros and unstable poles are considered, and the results also apply to repeated zeros and poles. To obtain the optimal performance, a two-parameter controller is adopted. The theoretical results show that the optimal tracking performance is influenced by the non-minimum phase zeros, the unstable poles, the gain of the given plant at all frequencies, and the reference input signal. Moreover, the performance limitation is also affected by the limited bandwidth, the additive colored Gaussian noise, and the multiplicities of the non-minimum phase zeros and unstable poles. Additionally, the minimal channel input power constraints are given under which the system remains stable and the performance limitation is attained. Finally, simulation examples are given to illustrate the theoretical results.
SY Apr 19, 2016
A hybrid approach for cooperative output regulation with sampled compensator
Chao Yang, Zhi-Hong Guan, Ming Chi et al.
This work investigates the cooperative output regulation problem of linear multi-agent systems with hybrid sampled-data control. Because data sensing and communication are limited, in many practical situations only sampled data are available for the cooperation of multi-agent systems. To overcome this problem, a distributed hybrid controller is presented, and cooperative output regulation is achieved by a well-designed state feedback law. A method is then proposed for designing a sampled-data controller that solves the cooperative output regulation problem for continuous-time linear systems with discrete-time communication data. Finally, a numerical simulation example of cooperative tracking and a simulation example of optimal control of micro-grids are presented to illustrate the results of the sampled-data control law.
OC Oct 17, 2022
Learning Decentralized Linear Quadratic Regulators with $\sqrt{T}$ Regret
Lintao Ye, Ming Chi, Ruiquan Liao et al.
We propose an online learning algorithm that adaptively designs a decentralized linear quadratic regulator when the system model is unknown a priori and new data samples from a single system trajectory become progressively available. The algorithm uses a disturbance-feedback representation of state-feedback controllers coupled with online convex optimization with memory and delayed feedback. Under the assumption that the system is stable or that a stabilizing controller is known, we show that our controller enjoys an expected regret that scales as $\sqrt{T}$ with the time horizon $T$ in the case of a partially nested information pattern. For more general information patterns, the optimal controller is unknown even if the system model is known. In this case, the regret of our controller is shown with respect to a linear sub-optimal controller. We validate our theoretical findings using numerical experiments.
LG Mar 28
Online Learning of Kalman Filtering: From Output to State Estimation
Lintao Ye, Ankang Zhang, Ming Chi et al.
In this paper, we study the problem of learning Kalman filtering with an unknown system model in partially observed linear dynamical systems. We propose a unified algorithmic framework based on online optimization that can be used to solve both the output estimation and state estimation scenarios. By exploiting properties of the estimation error cost functions, such as conditional strong convexity, we show that our algorithm achieves a $\log T$-regret in the horizon length $T$ for the output estimation scenario. More importantly, we tackle the more challenging scenario of learning Kalman filtering for state estimation, which is an open problem in the literature. We first characterize a fundamental limitation of the problem, demonstrating that no algorithm can achieve sublinear regret in $T$. By further introducing a random query scheme into our algorithm, we show that a $\sqrt{T}$-regret is achievable when the algorithm is given limited query access to more informative measurements of the system state in practice. Our algorithm and regret bound capture the trade-off between the number of queries and the achieved regret, and shed light on online learning problems with limited observations. We validate the performance of our algorithms using numerical examples.
LG May 11
Learning to Sparsify Stochastic Linear Bandits
Zhengmiao Wang, Ming Chi, Zhi-Wei Liu et al.
This paper addresses the problem of learning to sparsify stochastic linear bandits, where a decision-maker sequentially selects actions from a high-dimensional space subject to a sparsity constraint on the number of nonzero elements in the action vector. The key challenge lies in minimizing cumulative regret while tackling the potential NP-hardness of finding optimal sparse actions due to the inherent combinatorial structure of the problem. We propose an adaptively phased exploration and exploitation algorithmic framework, utilizing ordinary least squares for parameter learning and specialized subroutines for sparse action selection. When the action set is a Euclidean ball, optimal sparse actions can be efficiently computed, enabling us to establish a $\tilde{\mathcal{O}}(d\sqrt{T})$ regret, where $d$ is the dimension of the action vector and $T$ is the time horizon length. For general convex and compact action sets where finding optimal sparse actions is intractable, we employ a greedy subroutine. For general strongly convex action sets, we derive a $\tilde{\mathcal{O}}(d\sqrt{T})$ $α$-regret; for general compact sets lacking strong convexity, we establish a $\tilde{\mathcal{O}}(d T^{2/3})$ $α$-regret, where $α$ is related to the approximation ratio of the greedy algorithm. Finally, we validate the performance of our algorithms using extensive experiments, including an application to a recommendation system.
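The Euclidean-ball case mentioned above can be illustrated with a minimal sketch. It assumes the action set is the unit ball and that the subroutine maximizes the estimated linear reward under an s-sparsity constraint. The function name `best_sparse_action` and the example estimate `theta_hat` are hypothetical and not taken from the paper. Under these assumptions, the maximizer keeps the s largest-magnitude entries of the estimate and normalizes them.

```python
from math import sqrt
from typing import List


def best_sparse_action(theta: List[float], s: int) -> List[float]:
    """Maximize <theta, a> subject to ||a||_2 <= 1 and at most s nonzeros.

    Illustrative helper (hypothetical name): the maximizer keeps the s
    largest-magnitude entries of theta and scales them to unit norm.
    """
    if s <= 0 or not any(theta):
        return [0.0] * len(theta)
    # Indices of the s entries of theta with the largest absolute value.
    support = sorted(range(len(theta)), key=lambda i: abs(theta[i]), reverse=True)[:s]
    norm = sqrt(sum(theta[i] ** 2 for i in support))
    action = [0.0] * len(theta)
    for i in support:
        action[i] = theta[i] / norm
    return action


theta_hat = [3.0, -4.0, 1.0, 0.5]  # hypothetical parameter estimate
action = best_sparse_action(theta_hat, s=2)
reward = sum(t * a for t, a in zip(theta_hat, action))
print(action, reward)
```

Sorting costs O(d log d), which is why this case is tractable. For general action sets no such closed form is available, which is consistent with the abstract's use of a greedy subroutine there.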
SY Jan 27
Output Feedback Stabilization of Linear Systems via Policy Gradient Methods
Ankang Zhang, Ming Chi, Xiaoling Wang et al.
Stabilizing a dynamical system is a fundamental problem that serves as a cornerstone for many complex tasks in the field of control systems. The problem becomes challenging when the system model is unknown. Among the Reinforcement Learning (RL) algorithms that have been successfully applied to problems involving unknown linear dynamical systems, the policy gradient (PG) method stands out for its ease of implementation and its ability to solve the problem in a model-free manner. However, most of the existing works on PG methods for unknown linear dynamical systems assume full-state feedback. In this paper, we take a step towards model-free learning for partially observable linear dynamical systems with output feedback and focus on the fundamental stabilization problem of the system. We propose an algorithmic framework that stretches the boundary of PG methods to this problem, for which global convergence guarantees are not available. We show that by leveraging a zeroth-order PG update based on system trajectories and its convergence to stationary points, the proposed algorithms return a stabilizing output feedback policy for discrete-time linear dynamical systems. We also explicitly characterize the sample complexity of our algorithm and verify the effectiveness of the algorithm using numerical examples.
OC Oct 31, 2024
Online Convex Optimization with Memory and Limited Predictions
Lintao Ye, Zhengmiao Wang, Zhi-Wei Liu et al.
We study the problem of online convex optimization with memory and predictions over a horizon $T$. At each time step, a decision maker is given some limited predictions of the cost functions from a finite window of future time steps, i.e., values of the cost function at certain decision points in the future. The decision maker then chooses an action and incurs a cost given by a convex function that depends on the actions chosen in the past. We propose an algorithm to solve this problem and show that the dynamic regret of the algorithm decays exponentially with the prediction window length. Our algorithm contains two general subroutines that work for wider classes of problems. The first subroutine can solve general online convex optimization with memory and bandit feedback with $\sqrt{T}$-dynamic regret with respect to $T$. The second subroutine is a zeroth-order method that can be used to solve general convex optimization problems with a linear convergence rate that matches the best achievable rate of first-order methods for convex optimization. The key to our algorithm design and analysis is the use of truncated Gaussian smoothing when querying the decision points for obtaining the predictions. We complement our theoretical results using numerical experiments.
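To illustrate how function values alone can stand in for gradients, here is a minimal sketch of a zeroth-order gradient estimate based on plain Gaussian smoothing. It is not the paper's truncated variant. The function names, the smoothing radius `mu`, the sample count, and the quadratic test function are all assumptions made for illustration.

```python
import random
from typing import Callable, List


def zeroth_order_gradient(
    f: Callable[[List[float]], float],
    x: List[float],
    mu: float = 1e-2,
    num_samples: int = 20000,
    seed: int = 0,
) -> List[float]:
    """Estimate the gradient of f at x from function values only.

    Averages the two-point estimate ((f(x + mu*u) - f(x - mu*u)) / (2*mu)) * u
    over standard Gaussian directions u (untruncated; an assumption here).
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(d):
            grad[i] += slope * u[i] / num_samples
    return grad


def quadratic(x: List[float]) -> float:
    """Hypothetical test cost; its true gradient at x is 2*x."""
    return sum(xi * xi for xi in x)


estimate = zeroth_order_gradient(quadratic, [1.0, -2.0])
print(estimate)  # should be close to the true gradient [2.0, -4.0]
```

The estimate is noisy, and its variance grows with the dimension. Truncating the Gaussian samples, as the abstract describes, would keep the query points within a bounded distance of `x`.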
SY Jul 26, 2016
Task-space coordinated tracking of multiple heterogeneous manipulators via controller-estimator approaches
Ming-Feng Ge, Zhi-Hong Guan, Chao Yang et al.
This paper studies the task-space coordinated tracking of a time-varying leader for multiple heterogeneous manipulators (MHMs), comprising both redundant and nonredundant manipulators. In contrast to traditional coordinated control, distributed controller-estimator algorithms (DCEA), which consist of local algorithms and networked algorithms, are developed for MHMs with parametric uncertainties and input disturbances. By invoking differential inclusions, nonsmooth analysis, and input-to-state stability, conditions (both sufficient conditions and necessary and sufficient conditions) for the asymptotic stability of the task-space tracking errors and the subtask errors are developed. Simulation results are given to show the effectiveness of the presented DCEA.