5 Papers

69.6 · LG · May 31
Interaction-Limited Safe Continuous-Time RL for Dynamical Medical Treatment

Xun Shen, Yuepeng Wang, Akifumi Wachi et al.

Dynamic medical treatment requires deciding treatment intensity and intervention timing, while patient states evolve continuously and adverse events may occur between clinical interactions. Most existing treatment learning methods assume fixed schedules or enforce safety only at discrete decision points. We propose Interaction-Limited Safe Continuous-Time Reinforcement Learning, a framework that jointly optimizes treatment administration and clinical interaction timing under trajectory-level safety constraints. Our key idea is to reformulate the continuous-time treatment problem as an option-based semi-Markov decision process, where each option specifies a continuous-time treatment policy and its duration. We develop a safety-tightening mechanism showing that suitably constructed constraints at interaction times guarantee safety over the full continuous-time trajectory with high probability. We further establish finite-sample guarantees for policy learning from logged treatment trajectories and introduce a practical data-driven conservative surrogate. Experiments show that the proposed adaptive interaction-timing mechanism improves both safety and treatment effectiveness over equidistant interaction schemes across different safe policy optimization methods.

48.4 · OC · Apr 14
Finite-Time Optimization via Scaled Gradient-Momentum Flows

Yu Zhou, Mengmou Li, Masaaki Nagahara

In this paper, we develop a scaled gradient-momentum framework for continuous-time optimization that achieves global finite-time convergence. A state-dependent scaling mechanism is introduced to enable classical dynamics, such as Heavy-Ball-type and proportional-integral (PI)-type flows, to attain finite-time convergence. We establish explicit conditions that bridge the gradient-dominance property of the objective function and finite-time stability of the proposed scaled dynamics. Numerical experiments validate the theoretical results.
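To make the idea of state-dependent scaling concrete, here is a minimal sketch that is not the paper's gradient-momentum dynamics. It assumes a simpler first-order flow, x' = -∇f(x) / ||∇f(x)||^alpha, on an illustrative quadratic f(x) = 0.5 xᵀQx, and integrates it with forward Euler. The matrix `Q`, the exponent `alpha`, the step size, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

# Illustrative objective (an assumption): f(x) = 0.5 * x^T Q x, minimizer x* = 0.
Q = np.diag([1.0, 4.0])

def grad(x):
    return Q @ x

def simulate(x0, alpha, t_end=4.0, dt=1e-3):
    """Forward-Euler integration of x' = -grad f(x) / ||grad f(x)||**alpha.

    alpha = 0 recovers the plain gradient flow (exponential convergence);
    0 < alpha < 1 scales the gradient up near the optimum.
    """
    x = np.array(x0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        g = grad(x)
        # Guard against division by zero once the gradient vanishes.
        scale = max(np.linalg.norm(g), 1e-12) ** alpha
        x = x - dt * g / scale
    return x

x0 = [2.0, -1.0]
x_plain = simulate(x0, alpha=0.0)   # still visibly away from 0 at t = 4
x_scaled = simulate(x0, alpha=0.5)  # settles (up to discretization) before t = 4

print("plain  gradient flow, ||x(4)|| =", np.linalg.norm(x_plain))
print("scaled gradient flow, ||x(4)|| =", np.linalg.norm(x_scaled))
```

For this toy problem, the gradient-dominance inequality ||∇f||² ≥ 2f gives V' ≤ -2^0.75 V^0.75 for V = f, which bounds the settling time by about 3.4. This is the kind of link between gradient dominance and finite-time stability that the abstract refers to, shown here only for the simplified flow.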

38.9 · SY · Apr 17
A Common Lyapunov Matrix Approach to the Exponential Stability of Augmented Primal-Dual Gradient Flow as LPV Systems

Mengmou Li, Lijun Zhu, Masaaki Nagahara

We show that a common Lyapunov matrix exists for the convex combination of two Hurwitz matrices if and only if the intersection of the set of strict Lyapunov matrices for one matrix and the set of non-strict Lyapunov matrices for the other is nonempty. This simple relaxation is useful for the convergence analysis of the augmented primal-dual gradient flow for constrained optimization problems with affine inequality constraints, which can be viewed as a polytopic linear parameter-varying (LPV) system driven by the active-constraint selector. Under a relaxed strong convexity condition, exponential convergence is proved for the LPV system. The analysis can further be extended to the integral quadratic constraints (IQCs) framework for LPV systems to facilitate a numerical search for the convergence rate.
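The easy direction of this condition can be checked numerically. The sketch below uses two hand-picked Hurwitz matrices and P = I (all illustrative assumptions, not taken from the paper): P is a strict Lyapunov matrix for `A1` but only a non-strict one for `A2`, and yet the Lyapunov inequality is strict for every combination θ·A1 + (1−θ)·A2 with θ in (0, 1].

```python
import numpy as np

def lyapunov_lhs(A, P):
    """Return the symmetric matrix A^T P + P A."""
    return A.T @ P + P @ A

# Hypothetical example matrices; both are Hurwitz.
A1 = np.array([[-1.0, 2.0], [-2.0, -1.0]])   # eigenvalues -1 +/- 2i
A2 = np.array([[0.0, 1.0], [-1.0, -1.0]])    # eigenvalues -0.5 +/- 0.866i
P = np.eye(2)

# Strict for A1 (largest eigenvalue < 0), non-strict for A2 (largest eigenvalue = 0).
max_eig_1 = np.linalg.eigvalsh(lyapunov_lhs(A1, P)).max()
max_eig_2 = np.linalg.eigvalsh(lyapunov_lhs(A2, P)).max()

# The same P is a strict Lyapunov matrix for each sampled convex combination
# that puts positive weight on A1.
thetas = np.linspace(0.05, 1.0, 20)
max_eigs = [
    np.linalg.eigvalsh(lyapunov_lhs(t * A1 + (1.0 - t) * A2, P)).max()
    for t in thetas
]

print("A1:", max_eig_1, " A2:", max_eig_2)
print("worst case over sampled combinations:", max(max_eigs))
```

The reason is a one-line argument: a positive multiple of a negative definite matrix plus a nonnegative multiple of a negative semidefinite matrix is negative definite.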

OC · Jul 28, 2024
Small-Gain Theorem Based Distributed Prescribed-Time Convex Optimization For Networked Euler-Lagrange Systems

Gewei Zuo, Mengmou Li, Lijun Zhu

In this paper, we address the distributed prescribed-time convex optimization (DPTCO) problem for a class of networked Euler-Lagrange systems under undirected connected graphs. Using position-dependent measured gradient values of the local objective functions and local information exchange among neighboring agents, a set of auxiliary systems is constructed to cooperatively seek the optimal solution. The DPTCO problem is then converted to the prescribed-time stabilization problem of an interconnected error system. A prescribed-time small-gain criterion is proposed to characterize prescribed-time stabilization of the system, offering an approach that goes beyond existing asymptotic or finite-time stabilization results for interconnected systems. Under the criterion and the auxiliary systems, adaptive prescribed-time local tracking controllers are designed for the subsystems. Prescribed-time convergence is achieved by introducing time-varying gains that increase to infinity as time approaches the prescribed time. A Lyapunov function together with a prescribed-time mapping is used to prove prescribed-time stability of the closed-loop system as well as boundedness of the internal signals. Finally, the theoretical results are verified by a numerical example.
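The role of the time-varying gain can be seen on a scalar toy system, which is far simpler than the networked Euler-Lagrange setting of the paper. The sketch assumes the dynamics x' = -k(t)·x with the gain k(t) = c / (T − t), whose closed-form solution is x(t) = x0·((T − t)/T)^c. The gain form, the constants `T` and `c`, and the function name are illustrative assumptions.

```python
def simulate_prescribed_time(x0, T=1.0, c=2.0, t_stop=0.99, dt=1e-5):
    """Forward-Euler integration of x' = -(c / (T - t)) * x on [0, t_stop].

    The gain c / (T - t) grows without bound as t approaches T, so the
    simulation stops slightly before the prescribed time T.
    """
    x = float(x0)
    for k in range(int(round(t_stop / dt))):
        t = k * dt
        gain = c / (T - t)
        x += dt * (-gain * x)
    return x

# The contraction ratio x(t_stop) / x0 is the same for both initial
# conditions, so the convergence time does not depend on x0.
for x0 in (1.0, 100.0):
    x_end = simulate_prescribed_time(x0)
    print(f"x0 = {x0:6.1f}  ->  x(0.99) / x0 = {x_end / x0:.3e}")
```

With T = 1 and c = 2, the closed form gives x(0.99)/x0 = 0.01² = 10⁻⁴ for any x0. This is the distinguishing feature of prescribed-time convergence compared with finite-time convergence, where the settling time generally depends on the initial condition.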

20.5 · OC · Apr 3
A Canonical Structure for Constructing Projected First-Order Algorithms With Delayed Feedback

Mengmou Li, Yu Zhou, Xun Shen et al.

This work introduces a canonical structure for a broad class of unconstrained first-order algorithms that admit a Lur'e representation, including systems with relative degree greater than one, e.g., systems with delayed gradient feedback. The proposed canonical structure is obtained through a simple linear transformation. It enables a direct extension from unconstrained optimization algorithms to set-constrained ones through projection in a Lyapunov-induced norm. The resulting projected algorithms attain the optimal solution while preserving the convergence rates of their unconstrained counterparts.
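As a small illustration of projecting in a non-Euclidean norm, the sketch below assumes that the Lyapunov-induced norm is the weighted norm ||v||_P = sqrt(vᵀPv) for some positive definite matrix P, and that the constraint set is a single half-space {y : aᵀy ≤ b}, for which the projection has a closed form. The matrix `P`, the half-space, and the function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def project_halfspace_weighted(x, a, b, P):
    """Project x onto {y : a^T y <= b} in the norm ||v||_P = sqrt(v^T P v).

    From the KKT conditions: y = x - P^{-1} a * max(0, a^T x - b) / (a^T P^{-1} a).
    """
    Pinv_a = np.linalg.solve(P, a)
    violation = max(0.0, float(a @ x - b))
    return x - Pinv_a * violation / float(a @ Pinv_a)

def dist_P(u, v, P):
    """Distance between u and v in the P-weighted norm."""
    d = u - v
    return float(np.sqrt(d @ P @ d))

P = np.array([[2.0, 1.0], [1.0, 2.0]])   # assumed Lyapunov matrix (positive definite)
a = np.array([1.0, 0.0])
b = 0.0
x = np.array([1.0, 1.0])                 # infeasible point: a^T x = 1 > b

y_weighted = project_halfspace_weighted(x, a, b, P)
y_euclid = project_halfspace_weighted(x, a, b, np.eye(2))

print("P-norm projection:    ", y_weighted, " P-distance:", dist_P(x, y_weighted, P))
print("Euclidean projection: ", y_euclid, " P-distance:", dist_P(x, y_euclid, P))
```

Both points lie on the boundary of the half-space, but they differ, and the weighted projection is closer to `x` when distance is measured in the P-norm. Measuring distance in the same norm that certifies the unconstrained algorithm's decrease is what lets a projection step preserve a Lyapunov argument.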