Dimitra Panagou

34 papers

324 citations

Novelty: 48%

AI Score: 54

Ranked #26,789 of 201,326 authors (top 13%); #53 in SY (top 5%)

34 Papers

RO · May 24

A Formal gatekeeper Framework for Safe Dual Control with Active Exploration

Kaleb Ben Naveed, Devansh R. Agrawal, Dimitra Panagou

Planning safe trajectories under model uncertainty is a fundamental challenge. Robust planning ensures safety by considering worst-case realizations, yet ignores uncertainty reduction and leads to overly conservative behavior. Actively reducing uncertainty on-the-fly during a nominal mission defines the dual control problem. Most approaches address this by adding a weighted exploration term to the cost, tuned to trade off the nominal objective and uncertainty reduction, but without formal consideration of when exploration is beneficial. Moreover, safety is enforced in some methods but not in others. We propose a framework that integrates robust planning with active exploration under formal guarantees. The key contribution is that exploration is pursued only when it provides a verifiable improvement without compromising safety. To achieve this, we utilize our earlier work on gatekeeper as an architecture for safety verification, and extend it so that it generates both safe and informative trajectories that reduce uncertainty and the cost of the mission, or keep it within a user-defined budget. The methodology is evaluated via simulation case studies on the online dual control of a quadrotor under parametric uncertainty.

RO · Jun 2

Learning to Adapt Control Barrier Functions Under Epistemic and Aleatoric Uncertainty

Taekyung Kim, Robin Inho Kee, Dimitra Panagou

Control barrier functions (CBFs) provide a tractable mechanism for enforcing safety constraints in robotic systems, but their practical performance depends strongly on the choice of class-K function parameters. Under input constraints, conservative parameters often preserve feasibility at the cost of slow progress, whereas aggressive parameters can make the CBF-based optimization infeasible or unsafe. This paper proposes Online Adaptive CBF (OA-CBF), a framework for adapting CBF parameters at runtime. We introduce the notion of locally validated CBF parameters, which certify candidate parameters over a finite prediction horizon, and show that safety is preserved when such validation is maintained over successive update intervals. To identify locally validated parameters efficiently, OA-CBF trains a probabilistic ensemble neural network to evaluate queried CBF parameters rather than directly predict a single parameter. A graph-attention encoder represents variable-size obstacle environments, an epistemic uncertainty gate calibrated by conformal prediction rejects unreliable predictions, and a distributionally robust CVaR condition screens aleatoric risk. Among the verified candidates, OA-CBF selects the parameter with the best predicted progress metric and applies it through either an MPC-CBF or CBF-QP safety filter. Simulation studies on dynamic unicycle, planar and three-dimensional quadrotor, kinematic bicycle, and VTOL quadplane benchmarks show that OA-CBF reduces the conservatism of fixed-parameter CBF controllers while maintaining low collision and infeasibility rates.
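To illustrate why the class-K parameter matters in a CBF-QP safety filter, here is a minimal sketch under strong assumptions that are not taken from the paper: a 2D single integrator (state derivative equals the input), one circular obstacle, a linear class-K function alpha * h, and no input constraints. With a single affine constraint the QP has a closed-form solution, so no solver is needed. The function name and all parameters are hypothetical.

```python
def cbf_qp_filter(x, u_nom, center, radius, alpha):
    """Closed-form CBF-QP for a 2D single integrator (illustrative only).

    Solves: min ||u - u_nom||^2  s.t.  grad_h(x) . u + alpha * h(x) >= 0,
    with h(x) = ||x - center||^2 - radius^2 (h >= 0 means safe).
    Assumption: no input bounds, one obstacle, linear class-K function.
    """
    dx = x[0] - center[0]
    dy = x[1] - center[1]
    h = dx * dx + dy * dy - radius * radius
    a = (2.0 * dx, 2.0 * dy)          # gradient of h
    b = -alpha * h                    # constraint reads a . u >= b
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] - b
    if slack >= 0.0:
        # Nominal input already satisfies the CBF condition.
        return (u_nom[0], u_nom[1])
    norm_sq = a[0] * a[0] + a[1] * a[1]
    if norm_sq == 0.0:
        # Degenerate case (state at the obstacle center): nothing to project on.
        return (u_nom[0], u_nom[1])
    # Project u_nom onto the half-space boundary a . u = b.
    scale = -slack / norm_sq
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])


# Same state and nominal input, two different class-K gains.
cautious = cbf_qp_filter((-2.0, 0.0), (1.0, 0.0), (0.0, 0.0), 1.0, alpha=0.5)
aggressive = cbf_qp_filter((-2.0, 0.0), (1.0, 0.0), (0.0, 0.0), 1.0, alpha=2.0)
```

In this toy setting the smaller gain slows the approach to the obstacle, while the larger gain leaves the nominal input untouched. That is the conservatism trade-off the abstract describes, although the sketch does not model the feasibility issues that arise under input constraints.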

OC · Aug 26, 2019

Control-Lyapunov and Control-Barrier Functions based Quadratic Program for Spatio-temporal Specifications

Kunal Garg, Dimitra Panagou

This paper presents a method for control synthesis under spatio-temporal constraints. First, we consider the problem of reaching a set $S$ in a user-defined or prescribed time $T$. We define a new class of control Lyapunov functions, called prescribed-time control Lyapunov functions (PT CLF), and present sufficient conditions on the existence of a controller for this problem in terms of PT CLF. Then, we formulate a quadratic program (QP) to compute a control input that satisfies these sufficient conditions. Next, we consider control synthesis under spatio-temporal objectives given as: the closed-loop trajectories remain in a given set $S_s$ at all times; remain in a specific set $S_i$ during the time interval $[t_i, t_{i+1})$ for $i = 0, 1, \cdots, N$; and reach the set $S_{i+1}$ on or before $t = t_{i+1}$. We show that such spatio-temporal specifications can be translated into temporal logic formulas. We present sufficient conditions on the existence of a control input in terms of PT CLF and control barrier functions. Then, we present a QP to compute the control input efficiently, and show its feasibility under the assumption that a PT CLF exists. To the best of the authors' knowledge, this is the first paper proposing a QP-based method for the aforementioned problem of satisfying spatio-temporal specifications for nonlinear control-affine dynamics with input constraints. We also discuss the limitations of the proposed methods and directions of future work to overcome these limitations. We present numerical examples to corroborate our proposed methods.

CR · Apr 15

Digital Guardians: The Past and The Future of Cyber-Physical Resilience

Saurabh Bagchi, Hyunseung Kim, Tarek Abdelzaher et al.

Resilience in cyber-physical systems (CPS) is the fundamental ability to maintain safety and critical functionality despite adverse "perturbations," which include security attacks, environmental disruptions, and hardware or software failures. This survey provides a comprehensive review of CPS resilience, framing the field through five interconnected themes that are required in an integrated whole to achieve real-world resilience. The article first posits that resilience is a system-wide property emerging from interactions between hardware, software, and human users. Second, it addresses the challenges of learning-enabled CPS, which often operate in data-scarce environments characterized by imbalanced or noisy data, requiring innovative solutions like synthetic data generation and foundation model adaptation. Third, the survey examines proactive measures for resilience, which include distinctive aspects of verification, testing, and redundancy. Fourth, it explores recovery mechanisms, moving beyond traditional fault models to design "just good enough" recovery strategies that prioritize safety-critical functions during perturbations. Finally, it highlights the central role of the human, focusing on the different levels of human intervention, the necessity of trust calibration, and the requirement for explainable AI to support human-CPS teaming. These themes are illustrated through representative application domains, primarily Connected and Autonomous Transportation Systems (CATS) and Medical CPS (MCPS). By integrating the five interconnected themes, this survey provides a systematic roadmap for achieving resilient CPS in increasingly complex and adversarial environments.

SY · Feb 27, 2018

Resilient Leader-Follower Consensus to Arbitrary Reference Values

James Usevitch, Dimitra Panagou

The problem of consensus in the presence of misbehaving agents has increasingly attracted attention in the literature. Prior results have established algorithms and graph structures for multi-agent networks which guarantee the consensus of normally behaving agents in the presence of a bounded number of misbehaving agents. The final consensus value is guaranteed to fall within the convex hull of initial agent states. However, the problem of consensus tracking considers consensus to arbitrary reference values which may not lie within such bounds. Conditions for consensus tracking in the presence of misbehaving agents have not been fully studied. This paper presents conditions for a network of agents using the W-MSR algorithm to achieve this objective.
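For readers unfamiliar with W-MSR (Weighted Mean-Subsequence-Reduced), the following is a minimal sketch of one update at a single normal agent with scalar state. It follows the commonly stated form of the filter: discard up to F received values above the agent's own value and up to F below it, then average what remains. Uniform averaging weights are an assumption made here for brevity; the function name is hypothetical, and the leader-follower conditions from the paper are not reproduced.

```python
def wmsr_step(own, neighbor_values, F):
    """One W-MSR update for a scalar-valued agent (illustrative sketch).

    own: the agent's current value.
    neighbor_values: values received from in-neighbors.
    F: assumed upper bound on the number of misbehaving neighbors.
    Assumption: uniform weights over the retained values.
    """
    higher = sorted(v for v in neighbor_values if v > own)
    lower = sorted(v for v in neighbor_values if v < own)
    equal = [v for v in neighbor_values if v == own]
    # Drop the F largest values above own (all of them if fewer than F).
    kept_higher = higher[:max(0, len(higher) - F)]
    # Drop the F smallest values below own (all of them if fewer than F).
    kept_lower = lower[F:]
    kept = [own] + equal + kept_lower + kept_higher
    return sum(kept) / len(kept)


# The outlier 100 and the low value 1 are filtered out when F = 1.
updated = wmsr_step(5.0, [1.0, 4.0, 6.0, 100.0], F=1)
```

Because the agent always keeps its own value, the update is well defined even when every received value is discarded.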

SY · Oct 3, 2017

r-Robustness and (r,s)-Robustness of Circulant Graphs

James Usevitch, Dimitra Panagou

There has been recent growing interest in graph theoretical properties known as r- and (r,s)-robustness. These properties serve as sufficient conditions guaranteeing the success of certain consensus algorithms in networks with misbehaving agents present. Due to the complexity of determining the robustness for an arbitrary graph, several methods have previously been proposed for identifying the robustness of specific classes of graphs or constructing graphs with specified robustness levels. The majority of such approaches have focused on undirected graphs. In this paper we identify a class of scalable directed graphs whose edge set is determined by a parameter k and prove that the robustness of these graphs is also determined by k. We support our results through computer simulations.

SY · Mar 16, 2019

Prescribed-time convergence with input constraints: A control Lyapunov function based approach

Kunal Garg, Ehsan Arabi, Dimitra Panagou

In this paper, we present a control framework for a general class of control-affine nonlinear systems under spatiotemporal and input constraints. Specifically, the proposed control architecture addresses the problem of reaching a given final set $S$ in a prescribed (user-defined) time with bounded control inputs. To this end, a time transformation technique is utilized to transform the system subject to temporal constraints into an equivalent form without temporal constraints. The transformation is defined so that asymptotic convergence in the transformed time scale results in prescribed-time convergence in the original time scale. To incorporate input constraints, we characterize a set of initial conditions $D_M$ such that starting from this set, the closed-loop trajectories reach the set $S$ within the prescribed time. We further show that starting from outside the set $D_M$, the system trajectories reach the set $D_M$ in a finite time that depends upon the initial conditions and the control input bounds. We use a novel parameter $\mu$ in the controller that controls the convergence rate of the closed-loop trajectories and dictates the size of the set $D_M$. Finally, we present a numerical example to showcase the efficacy of our proposed method.

SY · May 3, 2018

Finite-Time Resilient Formation Control with Bounded Inputs

James Usevitch, Kunal Garg, Dimitra Panagou

In this paper we consider the problem of a multi-agent system achieving a formation in the presence of misbehaving or adversarial agents. We introduce a novel continuous time resilient controller to guarantee that normally behaving agents can converge to a formation with respect to a set of leaders. The controller employs a norm-based filtering mechanism, and unlike most prior algorithms, also incorporates input bounds. In addition, the controller is shown to guarantee convergence in finite time. A sufficient condition for the controller to guarantee convergence is shown to be a graph theoretical structure which we denote as Resilient Directed Acyclic Graph (RDAG). Further, we employ our filtering mechanism on a discrete time system which is shown to have exponential convergence. Our results are demonstrated through simulations.

SY · Mar 15, 2019

Herding an Adversarial Attacker to a Safe Area for Defending Safety-Critical Infrastructure

Vishnu S Chipade, Dimitra Panagou

This paper investigates a problem of defending safety-critical infrastructure from an adversarial aerial attacker in an urban environment. A circular arc formation of defenders is formed around the attacker, and vector-field based guidance laws herd the attacker to a predefined safe area in the presence of rectangular obstacles. The defenders' formation is defined based on a novel vector field that imposes super-elliptic contours around the obstacles, to closely resemble their rectangular shape. A novel finite-time stabilizing controller is proposed to guide the defenders to their desired formation while avoiding obstacles and inter-agent collisions. The efficiency of the approach is demonstrated via simulation results.

DS · Jul 4, 2018

New Results on Finite-Time Stability: Geometric Conditions and Finite-Time Controllers

Kunal Garg, Dimitra Panagou

This paper presents novel controllers that yield finite-time stability for linear systems. We first present a sufficient condition for the origin of a scalar system to be finite-time stable. Then we present novel finite-time controllers based on vector fields and barrier functions to demonstrate the utility of this geometric condition. We also consider the general class of linear controllable systems, and present a continuous feedback control law to stabilize the system in finite time. Finally, we present simulation results for each of these cases, showing the efficacy of the designed control laws.

OC · Mar 19

Feasibility Analysis and Constraint Selection in Optimization-Based Controllers

Panagiotis Rousseas, Haejoon Lee, Dimos V. Dimarogonas et al.

Control synthesis under constraints is at the forefront of research on autonomous systems, in part due to its broad application from low-level control to high-level planning, where computing control inputs is typically cast as a constrained optimization problem. Assessing feasibility of the constraints and selecting among subsets of feasible constraints is a challenging yet crucial problem. In this work, we provide a novel theoretical analysis that yields necessary and sufficient conditions for feasibility assessment of linear constraints and based on this analysis, we develop novel methods for feasible constraint selection in the context of control of autonomous systems. Through a series of simulations, we demonstrate that our algorithms achieve performance comparable to state-of-the-art methods while offering improved computational efficiency. Importantly, our analysis provides a novel theoretical framework for assessing, analyzing and handling constraint infeasibility.

SY · Sep 15, 2017

Chebyshev Approximation and Higher Order Derivatives of Lyapunov Functions for Estimating the Domain of Attraction

Dongkun Han, Dimitra Panagou

Estimating the Domain of Attraction (DA) of non-polynomial systems is a challenging problem. Taylor expansion is widely adopted for transforming a nonlinear analytic function into a polynomial function, but the performance of Taylor expansion is not always satisfactory. This paper provides solvable ways for estimating the DA via Chebyshev approximation. Firstly, for Chebyshev approximation without the remainder, higher order derivatives of Lyapunov functions are used for estimating the DA, and the largest estimate is obtained by solving a generalized eigenvalue problem. Moreover, for Chebyshev approximation with the remainder, an uncertain polynomial system is reformulated, and a condition is proposed for ensuring the convergence to the largest estimate with a selected Lyapunov function. Numerical examples demonstrate that both accuracy and efficiency are improved compared to Taylor approximation.

SY · Jun 19, 2019

Determining r-Robustness of Digraphs Using Mixed Integer Linear Programming

James Usevitch, Dimitra Panagou

Convergence guarantees of many resilient consensus algorithms are based on the graph theoretic properties of $r$- and $(r,s)$-robustness. These algorithms guarantee consensus of normally behaving agents in the presence of a bounded number of arbitrarily misbehaving agents if the values of the integers $r$ and $s$ are sufficiently high. However, determining the largest integer $r$ for which an arbitrary digraph is $r$-robust is highly nontrivial. This paper introduces a novel method for calculating this value using mixed integer linear programming. The method only requires knowledge of the graph Laplacian matrix, and can be formulated with affine objective and constraints, except for the integer constraint. Integer programming methods such as branch-and-bound can allow both lower and upper bounds on $r$ to be iteratively tightened. Simulations suggest the proposed method demonstrates greater efficiency than prior algorithms.
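To make the quantity being computed concrete, here is a brute-force sketch that evaluates the largest r for which a small digraph is r-robust directly from the standard definition: a digraph is r-robust if, for every pair of nonempty disjoint node subsets, at least one subset contains a node with at least r in-neighbors outside that subset. This exhaustive enumeration is not the mixed integer linear programming method of the paper; it grows as 3^n and is only meant for tiny graphs. The function names and the in-neighbor dictionary representation are assumptions for this example.

```python
from itertools import product


def reachability(subset, in_neighbors):
    """Largest r such that `subset` is r-reachable.

    That is, the maximum over nodes in the subset of the number of
    in-neighbors lying outside the subset.
    """
    return max(len(in_neighbors[i] - subset) for i in subset)


def max_r_robustness(in_neighbors):
    """Largest r for which the digraph is r-robust (exhaustive search).

    in_neighbors: dict mapping each node to the set of its in-neighbors.
    Every node is labeled 0 (unused), 1 (in S1), or 2 (in S2), which
    enumerates all pairs of disjoint subsets. Each unordered pair is
    visited twice, which is harmless for a minimum.
    """
    nodes = sorted(in_neighbors)
    best = None
    for labels in product((0, 1, 2), repeat=len(nodes)):
        s1 = frozenset(n for n, lab in zip(nodes, labels) if lab == 1)
        s2 = frozenset(n for n, lab in zip(nodes, labels) if lab == 2)
        if not s1 or not s2:
            continue
        value = max(reachability(s1, in_neighbors),
                    reachability(s2, in_neighbors))
        best = value if best is None else min(best, value)
    return 0 if best is None else best


# Complete digraph on 4 nodes, and a directed 4-cycle.
complete4 = {i: {j for j in range(4) if j != i} for i in range(4)}
cycle4 = {i: {(i - 1) % 4} for i in range(4)}
r_complete = max_r_robustness(complete4)
r_cycle = max_r_robustness(cycle4)
```

The exponential cost of this enumeration is one way to see why the abstract calls the problem highly nontrivial for arbitrary digraphs.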

SY · Feb 24, 2018

Approximating the Region of Multi-Task Coordination via the Optimal Lyapunov-Like Barrier Function

Dongkun Han, Lixing Huang, Dimitra Panagou

We consider the multi-task coordination problem for multi-agent systems under the following objectives: 1. collision avoidance; 2. connectivity maintenance; 3. convergence to desired destinations. The paper focuses on the safety guaranteed region of multi-task coordination (SG-RMTC), i.e., the set of initial states from which all trajectories converge to the desired configuration, while at the same time achieving the multi-task coordination and avoiding unsafe sets. In contrast to estimating the domain of attraction via Lyapunov functions, the main underlying idea is to employ the sublevel sets of Lyapunov-like barrier functions to approximate the SG-RMTC. Rather than using fixed Lyapunov-like barrier functions, a systematic way is proposed to search for an optimal Lyapunov-like barrier function such that the under-estimate of SG-RMTC is maximized. Numerical examples illustrate the effectiveness of the proposed method.

RO · Apr 16

Trajectory Planning for Safe Dual Control with Active Exploration

Kaleb Ben Naveed, Manveer Singh, Devansh R. Agrawal et al.

Planning safe trajectories under model uncertainty is a fundamental challenge. Robust planning ensures safety by considering worst-case realizations, yet ignores uncertainty reduction and leads to overly conservative behavior. Actively reducing uncertainty on-the-fly during a nominal mission defines the dual control problem. Most approaches address this by adding a weighted exploration term to the cost, tuned to trade off the nominal objective and uncertainty reduction, but without formal consideration of when exploration is beneficial. Moreover, safety is enforced in some methods but not in others. We study a budget-constrained dual control problem, where uncertainty is reduced subject to safety and a mission-level cost budget that limits the allowable degradation in task performance due to exploration. In this work, we propose Dual-gatekeeper, a framework that integrates robust planning with active exploration under formal guarantees of safety and budget feasibility. The key idea is that exploration is pursued only when it provides a verifiable improvement without compromising safety or violating the budget, enabling the system to balance immediate task performance with long-term uncertainty reduction in a principled manner. We provide two implementations of the framework based on different safety mechanisms and demonstrate its performance on quadrotor navigation and autonomous car racing case studies under parametric uncertainty.

SY · Jan 5, 2018

Robust Semi-Cooperative Multi-Agent Coordination in the Presence of Stochastic Disturbances

Kunal Garg, Dongkun Han, Dimitra Panagou

This paper presents a robust distributed coordination protocol that achieves generation of collision-free trajectories for multiple unicycle agents in the presence of stochastic uncertainties. We build upon our earlier work on semi-cooperative coordination and we redesign the coordination controllers so that the agents counteract a class of state (wind) disturbances and measurement noise. Safety and convergence are proved analytically, while simulation results demonstrate the efficacy of the proposed solution.

RO · Sep 26, 2022

FORESEE: Prediction with Expansion-Compression Unscented Transform for Online Policy Optimization

Hardik Parwana, Dimitra Panagou

Propagating state distributions through a generic, uncertain nonlinear dynamical model is known to be intractable and usually begets numerical or analytical approximations. We introduce a method for state prediction, called the Expansion-Compression Unscented Transform, and use it to solve a class of online policy optimization problems. Our proposed algorithm propagates a finite number of sigma points through a state-dependent distribution, which dictates an increase in the number of sigma points at each time step to represent the resulting distribution; this is what we call the expansion operation. To keep the algorithm scalable, we augment the expansion operation with a compression operation based on moment matching, thereby keeping the number of sigma points constant across predictions over multiple time steps. Its performance is empirically shown to be comparable to Monte Carlo but at a much lower computational cost. Under state and control input constraints, the state prediction is subsequently used in tandem with a proposed variant of constrained gradient-descent for online update of policy parameters in a receding horizon fashion. The framework is implemented as a differentiable computational graph for policy training. We showcase our framework for a quadrotor stabilization task as part of a benchmark comparison in safe-control-gym and for optimizing the parameters of a Control Barrier Function based controller in a leader-follower problem.

MA · Mar 15

R3R: Decentralized Multi-Agent Collision Avoidance with Infinite-Horizon Safety

Thomas Marshall Vielmetti, Devansh R. Agrawal, Dimitra Panagou

Existing decentralized methods for multi-agent motion planning lack formal, infinite-horizon safety guarantees, especially for communication-constrained systems. We present R3R which, to our knowledge, is the first decentralized and asynchronous framework for multi-agent motion planning under range-limited communication constraints with infinite-horizon safety guarantees for systems of nonlinear agents. R3R's novelty lies in combining our gatekeeper safety framework with a geometric constraint termed R-Boundedness, which together establish a formal link between an agent's communication radius and its ability to plan safely. We constrain trajectories to lie within a fixed planning radius, determined by a function of the agent's communication radius. This enables trajectories to be certified as provably safe for all time using only local information. Our algorithm is fully asynchronous, and ensures the forward invariance of these guarantees even in time-varying networks where agents asynchronously join and replan. We evaluate our approach in simulations of up to 128 Dubins vehicles, validating our theoretical safety guarantees in dense, obstacle-rich scenarios. We further show that R3R's computational complexity scales with local agent density rather than problem size, providing a practical solution for scalable and provably safe multi-agent systems.

SY · Nov 13, 2018

Hybrid Planning and Control for Multiple Fixed-Wing Aircraft under Input Constraints

Kunal Garg, Dimitra Panagou

This paper presents a novel hybrid control protocol for de-conflicting multiple vehicles with constraints on control inputs. We consider turning rate and linear speed constraints to represent fixed-wing or car-like vehicles. A set of state-feedback controllers along with a state-dependent switching logic are synthesized in a hybrid system to generate collision-free trajectories that converge to the desired destinations of the vehicles. The switching law is designed so that the safety can be guaranteed while no Zeno behavior can occur. A novel temporary goal assignment technique is also designed to guarantee convergence. We analyze the individual modes for safety and the closed-loop hybrid system for convergence. The theoretical developments are demonstrated via simulation results.

RO · May 20

Reinforcement Learning for Risk Adaptation via Differentiable CVaR Barrier Functions

Xinyi Wang, Taekyung Kim, Bardh Hoxha et al.

Planning through crowded environments under uncertain obstacle motions remains difficult, as stochastic interactions often induce overly conservative behavior or reduced efficiency. To address this challenge, we propose an end-to-end risk adaptation framework for crowd navigation under obstacle-motion uncertainty modeled by a Gaussian mixture model. The framework combines reinforcement learning (RL) with a differentiable quadratic-program safety layer based on Conditional Value-at-Risk (CVaR) barrier functions, jointly learning nominal control input, risk level, and safety margin and enforcing explicit probabilistic safety constraints. This design enables context-aware adaptation, promoting efficient behavior while invoking caution only when necessary. We conduct extensive evaluations in dynamic, uncertain, and crowded environments across varying obstacle densities and robot models, and further assess generalization under three out-of-distribution cases. Comparisons across optimization-based, RL-based, and integrated RL and optimization methods are provided, and the proposed method is shown to deliver the strongest overall performance in safety, efficiency, and generalization under uncertainty.
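As background on the risk measure named above, the sketch below estimates CVaR from samples with a simple tail-average estimator and uses it to check a barrier-style condition. It is only an illustration under stated assumptions: samples of a barrier value h (h >= 0 meaning safe) are taken as given, the loss is defined as -h, and the risk level alpha is fixed by hand. The paper's differentiable QP layer, Gaussian mixture model, and learned risk level are not reproduced, and both function names are hypothetical.

```python
import math


def empirical_cvar(losses, alpha):
    """Tail-average estimate of CVaR at level alpha (illustrative).

    Returns the mean of the worst ceil((1 - alpha) * n) losses.
    The inner round() guards against floating-point noise such as
    (1 - 0.7) * 10 evaluating to 3.0000000000000004.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)
    k = max(1, math.ceil(round((1.0 - alpha) * n, 9)))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


def cvar_constraint_holds(h_samples, alpha):
    """Check CVaR_alpha(-h) <= 0 on sampled barrier values (assumed form)."""
    return empirical_cvar([-h for h in h_samples], alpha) <= 0.0


tail_mean = empirical_cvar([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0.7)
lenient = cvar_constraint_holds([0.5, 0.2, 0.1, -0.05], alpha=0.5)
strict = cvar_constraint_holds([0.5, 0.2, 0.1, -0.05], alpha=0.75)
```

Raising alpha focuses the average on a smaller, worse tail, so the same samples can pass at one risk level and fail at a stricter one. This gives some intuition for why adapting the risk level can trade caution against efficiency.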

RO · May 15

Policy Library CBF: Finite-Horizon Safety at Runtime via Parallel Rollouts

Taekyung Kim, Hideki Okamoto, Bardh Hoxha et al.

Safety-critical autonomy in unstructured environments poses significant challenges for online safety certification under evolving constraints. We propose Policy Library Control Barrier Function (PL-CBF), a runtime safety filter that evaluates a library of fallback policies via parallel finite-horizon rollouts, selects the least invasive safe mode, and enforces safety by solving a quadratic program that minimally modifies a nominal policy. We provide a theoretical analysis based on a finite-horizon language metric over closed-loop behaviors, characterizing policy-library coverage requirements for certifying finite-horizon safety. Simulations on a planar double-integrator (4 states), highway driving with abrupt friction changes using a realistic nonlinear vehicle model (8 states), and 3D quadrotor navigation in crowded dynamic environments (12 states) demonstrate improved safety coverage over single-policy safety filters while retaining millisecond-level runtime.

RO · May 13

Distributionally Robust Safety Under Arbitrary Uncertainties: A Safety Filtering Approach

Daniel M. Cherenson, Haejoon Lee, Taekyung Kim et al.

In this work, we study how to ensure probabilistic safety for nonlinear systems under distributional ambiguity. Our approach builds on a backup-based safety filtering framework that switches between a high-performance nominal policy and a certified backup policy to ensure safety. To handle arbitrary uncertainties from ambiguous distributions, i.e., where the distribution is not of specific structure and the true distribution is unknown, we adopt a distributionally robust (DR) formulation using Wasserstein ambiguity sets. Rather than solving a high-dimensional DR trajectory optimization problem online, we exploit the structure of backup-based safety filtering to reduce safety certification to a one-dimensional search over the switching time between nominal and backup policies. We then develop a sampling-based certification procedure with finite-sample guarantees, where empirical failure probabilities are compared against a Wasserstein-inflated threshold. We validate our method through simulations across three systems, from a Dubins vehicle to a high-speed racing car and a fighter jet, demonstrating the broad applicability and computational efficiency.
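The reduction to a one-dimensional search over the switching time can be pictured with a deliberately simplified, deterministic toy problem. The sketch assumes a 1D double integrator driving toward a wall, a nominal policy of constant acceleration, and a backup policy of constant braking to a stop. It omits everything probabilistic in the paper (sampling, Wasserstein ambiguity sets, finite-sample guarantees), and all names and parameters are hypothetical.

```python
def stop_position(p, v, t_switch, a_nom, a_brake):
    """Final resting position if nominal runs for t_switch, then braking.

    Assumes v >= 0, a_nom >= 0, and a_brake > 0 (braking deceleration),
    so position increases monotonically until the vehicle stops.
    """
    p1 = p + v * t_switch + 0.5 * a_nom * t_switch ** 2
    v1 = v + a_nom * t_switch
    return p1 + v1 ** 2 / (2.0 * a_brake)


def largest_safe_switch_time(p, v, wall, a_nom, a_brake, horizon, steps):
    """Grid search for the latest switching time that still stops in time.

    Scans candidate switching times from `horizon` down to 0 and returns
    the first one whose nominal-then-backup trajectory stays at or before
    `wall`. Returns None if even switching immediately is unsafe.
    """
    for k in range(steps, -1, -1):
        t_switch = horizon * k / steps
        if stop_position(p, v, t_switch, a_nom, a_brake) <= wall:
            return t_switch
    return None


# Start at rest, wall 10 m ahead, unit acceleration and braking.
t_best = largest_safe_switch_time(0.0, 0.0, 10.0, 1.0, 1.0, horizon=5.0, steps=50)
```

In this toy case the stopping position equals the square of the switching time, so the search returns the largest grid point below the square root of 10. A sampling-based version would replace the single safety check with an empirical failure probability compared against a threshold, which is the part this sketch leaves out.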

RO · Apr 2

Backup-Based Safety Filters: A Comparative Review of Backup CBF, Model Predictive Shielding, and gatekeeper

Taekyung Kim, Aswin D. Menon, Akshunn Trivedi et al.

This paper revisits three backup-based safety filters -- Backup Control Barrier Functions (Backup CBF), Model Predictive Shielding (MPS), and gatekeeper -- through a unified comparative framework. Using a common safety-filter abstraction and shared notation, we make explicit both their common backup-policy structure and their key algorithmic differences. We compare the three methods through their filter-inactive sets, i.e., the states where the nominal policy is left unchanged. In particular, we show that MPS is a special case of gatekeeper, and we further relate gatekeeper to the interior of the Backup CBF inactive set within the implicit safe set. This unified view also highlights a key source of conservatism in backup-based safety filters: safety is often evaluated through the feasibility of a backup maneuver, rather than through the nominal policy's continued safe execution. The paper is intended as a compact tutorial and review that clarifies the theoretical connections and differences among these methods.

MA · May 9

Robust Multi-Agent LLMs under Byzantine Faults

Haejoon Lee, Vincent-Daniel Yun, Hyeonho Oh et al.

Large language model (LLM) agents increasingly collaborate over peer-to-peer networks to improve their reliability. However, these same interactions can also become a source of vulnerability, as unreliable or Byzantine agents may sway neighboring agents toward incorrect conclusions and degrade overall system performance. Existing methods rely on leader-based coordination or self-reported confidence, both of which are susceptible to adversarial manipulation. We study decentralized LLM multi-agent systems (LLM-MAS) and propose Self-Anchored Consensus (SAC), a fully decentralized iterative filter-and-refine protocol in which agents iteratively exchange responses, locally evaluate and filter unreliable messages, and refine their own outputs. We present $(F{+}1)$-robustness conditions for the communication graph that ensure honest agents preserve and propagate reliable information despite Byzantine influence. Experiments on mathematical and commonsense reasoning benchmarks show that SAC effectively suppresses Byzantine influence and consistently improves performance across diverse communication topologies, whereas prior methods degrade under adversarial conditions.

MA · Mar 16

Partial Resilient Leader-Follower Consensus in Time-Varying Graphs

Haejoon Lee, Dimitra Panagou

This work studies resilient leader-follower consensus with a bounded number of adversaries. Existing approaches typically require robustness conditions of the entire network to guarantee resilient consensus. However, the behavior of such systems when these conditions are not fully met remains unexplored. To address this gap, we introduce the notion of partial leader-follower consensus, in which a subset of non-adversarial followers successfully tracks the leader's reference state despite insufficient robustness. We propose a novel distributed algorithm - the Bootstrap Percolation and Mean Subsequence Reduced (BP-MSR) algorithm - and establish sufficient conditions for individual followers to achieve consensus via the BP-MSR algorithm in arbitrary time-varying graphs. We validate our findings through simulations, demonstrating that our method guarantees partial leader-follower consensus, even when standard resilient consensus algorithms fail.

MA · Apr 3

Fully Byzantine-Resilient Distributed Multi-Agent Q-Learning

Haejoon Lee, Dimitra Panagou

We study Byzantine-resilient distributed multi-agent reinforcement learning (MARL), where agents must collaboratively learn optimal value functions over a compromised communication network. Existing resilient MARL approaches typically guarantee almost sure convergence only to near-optimal value functions, or require restrictive assumptions to ensure convergence to the optimal solution. As a result, agents may fail to learn the optimal policies under these methods. To address this, we propose a novel distributed Q-learning algorithm, under which all agents' value functions converge almost surely to the optimal value functions despite Byzantine edge attacks. The key idea is a redundancy-based filtering mechanism that leverages two-hop neighbor information to validate incoming messages, while preserving bidirectional information flow. We then introduce a new topological condition for the convergence of our algorithm, present a systematic method to construct such networks, and prove that this condition can be verified in polynomial time. We validate our results through simulations, showing that our method converges to the optimal solutions, whereas prior methods fail under Byzantine edge attacks.

SYApr 7

Staggered Integral Online Conformal Prediction for Safe Dynamics Adaptation with Multi-Step Coverage Guarantees

Daniel M. Cherenson, Dimitra Panagou

Safety-critical control of uncertain, adaptive systems often relies on conservative, worst-case uncertainty bounds that limit closed-loop performance. Online conformal prediction is a powerful data-driven method for quantifying uncertainty when truth values of predicted outputs are revealed online; however, for systems that adapt the dynamics without measurements of the state derivatives, standard online conformal prediction is insufficient to quantify the model uncertainty. We propose Staggered Integral Online Conformal Prediction (SI-OCP), an algorithm utilizing an integral score function to quantify the lumped effect of disturbance and learning error. This approach provides long-run coverage guarantees, resulting in long-run safety when synthesized with safety-critical controllers, including robust tube model predictive control. Finally, we validate the proposed approach through a numerical simulation of an all-layer deep neural network (DNN) adaptive quadcopter using robust tube MPC, highlighting the applicability of our method to complex learning parameterizations and control strategies.
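
For readers unfamiliar with online conformal prediction, the following is a minimal sketch of the standard quantile-tracking update, in which the threshold grows after a miss and shrinks otherwise. It is not SI-OCP: the integral score function and the staggering scheme are not reproduced, and the names `online_conformal_step`, `run`, `alpha`, and `eta`, as well as the uniform random scores, are assumptions for illustration only.

```python
import random


def online_conformal_step(q, score, alpha, eta):
    """Move the threshold q up after a miss (score > q), down otherwise."""
    miss = 1.0 if score > q else 0.0
    return q + eta * (miss - alpha)


def run(scores, alpha=0.1, eta=0.05, q0=0.0):
    """Track the threshold over a score stream; return it with the empirical miss rate."""
    q, misses = q0, 0
    for s in scores:
        misses += int(s > q)
        q = online_conformal_step(q, s, alpha, eta)
    return q, misses / len(scores)


# Hypothetical score stream with values in [0, 1).
rng = random.Random(0)
scores = [rng.random() for _ in range(5000)]
q_final, miss_rate = run(scores)
```

For bounded scores, this update keeps the long-run miss rate close to `alpha` regardless of how the scores are distributed, which is the kind of long-run coverage property the abstract refers to.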

SYDec 22, 2021

IDCAIS: Inter-Defender Collision-Aware Interception Strategy against Multiple Attackers

Vishnu S. Chipade, Xinyi Wang, Dimitra Panagou

In the prior literature on multi-agent area defense games, defenders are assigned to attackers based on a cost metric associated only with the interception of the attackers. In contrast, this paper presents an Inter-Defender Collision-Aware Interception Strategy (IDCAIS) for defenders to intercept attackers in order to defend a protected area, such that the defender-to-attacker assignment protocol accounts not only for an interception-related cost but also for any possible future collisions among the defenders on their optimal interception trajectories. In particular, the defenders are assigned to intercept attackers using a mixed-integer quadratic program (MIQP) that: 1) minimizes the sum of times taken by defenders to capture the attackers under time-optimal control, and 2) helps eliminate or delay possible future collisions among the defenders on the optimal trajectories. To prevent inevitable collisions on optimal trajectories or collisions arising due to time-sub-optimal behavior by the attackers, a minimally augmented control using an exponential control barrier function (ECBF) is also provided. Simulations show the efficacy of the approach.

OCSep 22, 2021

Recursive Feasibility Guided Optimal Parameter Adaptation of Differential Convex Optimization Policies for Safety-Critical Systems

Hardik Parwana, Dimitra Panagou

Quadratic Program (QP) based state-feedback controllers, whose inequality constraints bound the rate of change of control barrier functions (CBFs) and Lyapunov functions with a class-$\mathcal{K}$ function of their values, are sensitive to the parameters of these class-$\mathcal{K}$ functions. The construction of valid CBFs, however, is not straightforward, and for arbitrarily chosen parameters of the QP, the system trajectories may enter states at which the QP either eventually becomes infeasible or fails to achieve the desired performance. In this work, we pose the control synthesis problem as a differential policy whose parameters are optimized for performance over a time horizon at a high level, thus resulting in a bi-level optimization routine. In the absence of knowledge of the set of feasible parameters, we develop a Recursive Feasibility Guided Gradient Descent approach for updating the parameters of the QP so that the new solution performs at least as well as the previous solution. By considering the dynamical system as a directed graph over time, this work presents a novel way of optimizing the performance of a QP controller over a time horizon for multiple CBFs by (1) using the gradient of its solution with respect to its parameters by employing sensitivity analysis, and (2) backpropagating these as well as system dynamics gradients to update parameters while maintaining feasibility of the QPs.
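
To make the sensitivity to the class-K parameter concrete, here is a minimal sketch of a CBF-QP safety filter with a single constraint, which admits a closed-form solution. It assumes a 2D single-integrator model (state derivative equals the input), a circular obstacle with barrier h(x) = ||x - c||^2 - r^2, and a linear class-K function alpha*h; none of these choices come from the paper, and the function name `cbf_qp_filter` is hypothetical. The parameter adaptation proposed in the paper is not shown.

```python
def cbf_qp_filter(x, u_nom, center, radius, alpha):
    """Closed-form solution of: min ||u - u_nom||^2  s.t.  grad_h . u + alpha * h >= 0.

    Assumes single-integrator dynamics and x != center.
    """
    dx = [x[0] - center[0], x[1] - center[1]]
    h = dx[0] ** 2 + dx[1] ** 2 - radius ** 2   # barrier value, positive outside the obstacle
    a = [2.0 * dx[0], 2.0 * dx[1]]              # gradient of h at x
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] + alpha * h
    if slack >= 0.0:
        return list(u_nom)                      # nominal input already satisfies the constraint
    # Otherwise project u_nom onto the boundary of the constraint half-space.
    scale = slack / (a[0] ** 2 + a[1] ** 2)
    return [u_nom[0] - scale * a[0], u_nom[1] - scale * a[1]]


# Hypothetical scenario: robot at (-2, 0) heading toward a unit obstacle at the origin.
u_conservative = cbf_qp_filter((-2.0, 0.0), (1.0, 0.0), (0.0, 0.0), 1.0, alpha=0.5)
u_aggressive = cbf_qp_filter((-2.0, 0.0), (1.0, 0.0), (0.0, 0.0), 1.0, alpha=2.0)
```

In this scenario the smaller `alpha` slows the approach, while the larger `alpha` leaves the nominal input unchanged. With input bounds or several constraints, which this sketch omits, an aggressive choice can later leave the QP with no feasible input.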

SYJul 10, 2020

Approximate Time-Optimal Trajectories for Damped Double Integrator in 2D Obstacle Environments under Bounded Inputs

Vishnu S. Chipade, Dimitra Panagou

This article extends an existing time-optimal trajectory planning algorithm based on path-velocity decomposition \cite{kant1986toward} to scenarios in which agents move in a 2D obstacle environment under double integrator dynamics with a drag term (damped double integrator). In particular, we extend the idea of a tangent graph \cite{liu1992path} to a $\mathcal{C}^1$-Tangent graph, which contains a continuously differentiable ($\mathcal{C}^1$) path between any two nodes, in order to find the continuously differentiable shortest path between any two points. We also provide analytical expressions for a near time-optimal velocity profile for an agent moving on these shortest paths under the damped double integrator with bounded acceleration.

MAJul 8, 2020

Multi-Swarm Herding: Protecting against Adversarial Swarms

Vishnu S. Chipade, Dimitra Panagou

This paper studies a defense approach against one or more swarms of adversarial agents. In our earlier work, we employed a closed formation ('StringNet') of defending agents (defenders) around a swarm of adversarial agents (attackers) to confine their motion within given bounds and guide them to a safe area. The control design relies on the assumption that the adversarial agents remain close enough to each other, i.e., within a prescribed connectivity region. To handle situations in which the attackers no longer stay within such a connectivity region, but rather split into smaller swarms (clusters) to maximize the chance or impact of attack, this paper proposes an approach to learn the attacking sub-swarms and reassign defenders towards the attackers. We use the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm to identify the spatially distributed swarms of the attackers. Then, the defenders are assigned to each identified swarm of attackers by solving a constrained generalized assignment problem. Simulations are provided to demonstrate the effectiveness of the approach.
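
The clustering step can be sketched with a small, self-contained version of textbook DBSCAN. The attacker positions, `eps`, and `min_pts` below are made-up values, and the function `dbscan` is a plain illustrative implementation rather than the authors' code; the subsequent defender assignment is not shown.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices within eps of point i (the point itself is included).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins the cluster but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels


# Hypothetical attacker positions: two tight groups and one straggler.
attackers = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.5, 10), (10, 10.5), (5, 5)]
labels = dbscan(attackers, eps=1.0, min_pts=3)
```

Here the two groups receive labels 0 and 1 and the isolated attacker is marked as noise (-1), so the number of sub-swarms does not have to be specified in advance.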

SYSep 2, 2017

Distributed Multi-task Formation Control under Parametric Communication Uncertainties

Dongkun Han, Dimitra Panagou

Formation control is a key problem in the coordination of multiple agents. New challenges arise for traditional formation control strategies when the communication among agents is affected by uncertainties. This paper considers the robust multi-task formation control problem for multiple non-point agents whose communications are disturbed by uncertain parameters. The control objectives are: (1) achieving the desired configuration; (2) avoiding collisions; and (3) preserving the connectedness of the uncertain topology. To achieve these objectives, a condition based on Linear Matrix Inequalities (LMIs) is first proposed for checking the connectedness of an uncertain communication topology. Then, by preserving the initial topological connectedness, a gradient-based distributed controller is designed via Lyapunov-like barrier functions. Two numerical examples illustrate the effectiveness of the proposed method.

ROApr 23, 2014

Motion planning and Collision Avoidance using Non-Gradient Vector Fields. Technical Report

Dimitra Panagou

This paper presents a novel feedback method for the motion planning of unicycle robots in environments with static obstacles, along with an extension to distributed planning and coordination in multi-robot systems. The method employs a family of 2-dimensional analytic vector fields, whose integral curves exhibit various patterns depending on the value of a parameter lambda. More specifically, for an a priori known value of lambda, the vector field has a unique singular point of dipole type and can be used to steer the unicycle to a goal configuration. Furthermore, for the unique value of lambda at which the vector field has a continuum of singular points, the integral curves are used to define flows around obstacles. An almost global feedback motion plan can then be constructed by suitably blending attractive and repulsive vector fields in a static obstacle environment. The method does not suffer from the appearance of sinks (stable nodes) away from the goal point. Compared to other similar methods that are free of local minima, the proposed approach does not require any parameter tuning to obtain the desired convergence properties. The paper also addresses the extension of the method to the distributed coordination and control of multiple robots, where each robot needs to navigate to a goal configuration while avoiding collisions with the remaining robots and using local information only. More specifically, based on the results for the single-robot case, a motion coordination protocol is presented which guarantees the safety of the multi-robot system and the almost global convergence of the robots to their goal configurations. The efficacy of the proposed methodology is demonstrated via simulation results in static and dynamic environments.

MAFeb 15, 2014

Decentralized Goal Assignment and Safe Trajectory Generation in Multi-Robot Networks via Multiple Lyapunov Functions

Dimitra Panagou, Matthew Turpin, Vijay Kumar

This paper considers the problem of decentralized goal assignment and trajectory generation for multi-robot networks when only local communication is available, and proposes an approach based on methods related to switched systems and set invariance. A family of Lyapunov-like functions is employed to encode the (local) decision making among candidate goal assignments, under which a group of connected agents chooses the assignment that results in the shortest total distance to the goals. An additional family of Lyapunov-like barrier functions is activated when the optimal assignment may lead to colliding trajectories, thus maintaining system safety while preserving the convergence guarantees. The proposed switching strategies give rise to feedback control policies that are computationally efficient and scalable with the number of agents, and therefore suitable for applications including first-response deployment of robotic networks under limited information sharing. The efficacy of the proposed method is demonstrated via simulation results and experiments with six ground robots.
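
The selection criterion mentioned above (the assignment with the shortest total distance to the goals) can be illustrated for a small group of connected agents by exhaustive search. This sketch is centralized and brute-force, so it only shows the objective being minimized; it does not reproduce the paper's Lyapunov-based decentralized switching strategy, and the function name `best_assignment` and the coordinates are hypothetical.

```python
import math
from itertools import permutations


def best_assignment(robots, goals):
    """Return (assignment, cost), where assignment[i] is the goal index for robot i."""
    best_cost, best_perm = math.inf, None
    # Try every way of giving each robot a distinct goal.
    for perm in permutations(range(len(goals)), len(robots)):
        cost = sum(math.dist(robots[i], goals[g]) for i, g in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_perm, best_cost


# Hypothetical positions: swapping goals shortens the total travel distance.
robots = [(0.0, 0.0), (4.0, 0.0)]
goals = [(4.0, 1.0), (0.0, 1.0)]
assignment, total = best_assignment(robots, goals)
```

Exhaustive search grows factorially with the number of agents, so it is only sensible for small connected groups such as this two-robot case.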