Cedric Langbort

GT
h-index2
13papers
129citations
Novelty41%
AI Score41

13 Papers

OCApr 27, 2012
Optimal Structured Static State-Feedback Control Design with Limited Model Information for Fully-Actuated Systems

Farhad Farokhi, Cedric Langbort, Karl H. Johansson

We introduce the family of limited model information control design methods, which construct controllers by accessing the plant's model in a constrained way, according to a given design graph. We investigate the closed-loop performance achievable by such control design methods for fully-actuated discrete-time linear time-invariant systems, under a separable quadratic cost. We restrict our study to control design methods which produce structured static state feedback controllers, where each subcontroller can at least access the state measurements of those subsystems that affect its corresponding subsystem. We compute the optimal control design strategy (in terms of the competitive ratio and domination metrics) when the control designer has access to the local model information and the global interconnection structure of the plant-to-be-controlled. Lastly, we study the trade-off between the amount of model information exploited by a control design method and the best closed-loop performance (in terms of the competitive ratio) of controllers it can produce.

GTSep 29, 2017
Strategic Communication Between Prospect Theoretic Agents over a Gaussian Test Channel

Venkata Sriram Siddhardh Nadendla, Emrah Akyol, Cedric Langbort et al.

In this paper, we model a Stackelberg game in a simple Gaussian test channel where a human transmitter (leader) communicates a source message to a human receiver (follower). We model human decision making using prospect theory models proposed for continuous decision spaces. Assuming that the value function is the squared distortion at both the transmitter and the receiver, we analyze the effects of the weight functions at both the transmitter and the receiver on optimal communication strategies, namely encoding at the transmitter and decoding at the receiver, in the Stackelberg sense. We show that the optimal strategies for the behavioral agents in the Stackelberg sense are identical to those designed for unbiased agents. At the same time, we also show that the prospect-theoretic distortions at both the transmitter and the receiver are both larger than the expected distortion, thus making behavioral agents less contended than unbiased agents. Consequently, the presence of cognitive biases increases the need for transmission power in order to achieve a given distortion at both transmitter and receiver.

AIJun 24, 2023
Pointwise-in-Time Explanation for Linear Temporal Logic Rules

Noel Brindise, Cedric Langbort

The new field of Explainable Planning (XAIP) has produced a variety of approaches to explain and describe the behavior of autonomous agents to human observers. Many summarize agent behavior in terms of the constraints, or ''rules,'' which the agent adheres to during its trajectories. In this work, we narrow the focus from summary to specific moments in individual trajectories, offering a ''pointwise-in-time'' view. Our novel framework, which we define on Linear Temporal Logic (LTL) rules, assigns an intuitive status to any rule in order to describe the trajectory progress at individual time steps; here, a rule is classified as active, satisfied, inactive, or violated. Given a trajectory, a user may query for status of specific LTL rules at individual trajectory time steps. In this paper, we present this novel framework, named Rule Status Assessment (RSA), and provide an example of its implementation. We find that pointwise-in-time status assessment is useful as a post-hoc diagnostic, enabling a user to systematically track the agent's behavior with respect to a set of rules.

16.7LGApr 18
Live LTL Progress Tracking: Towards Task-Based Exploration

Noel Brindise, Cedric Langbort, Melkior Ornik

Motivated by the challenge presented by non-Markovian objectives in reinforcement learning (RL), we present a novel framework to track and represent the progress of autonomous agents through complex, multi-stage tasks. Given a specification in finite linear temporal logic (LTL), the framework establishes a 'tracking vector' which updates at each time step in a trajectory rollout. The values of the vector represent the status of the specification as the trajectory develops, assigning true, false, or 'open' labels (where 'open' is used for indeterminate cases). Applied to an LTL formula tree, the tracking vector can be used to encode detailed information about how a task is executed over a trajectory, providing a potential tool for new performance metrics, diverse exploration, and reward shaping. In this paper, we formally present the framework and algorithm, collectively named Live LTL Progress Tracking, give a simple working example, and demonstrate avenues for its integration into RL models. Future work will apply the framework to problems such as task-space exploration and diverse solution-finding in RL.

LGDec 11, 2023
Online Decision Making with History-Average Dependent Costs (Extended)

Vijeth Hebbar, Cedric Langbort

In many online sequential decision-making scenarios, a learner's choices affect not just their current costs but also the future ones. In this work, we look at one particular case of such a situation where the costs depend on the time average of past decisions over a history horizon. We first recast this problem with history dependent costs as a problem of decision making under stage-wise constraints. To tackle this, we then propose the novel Follow-The-Adaptively-Regularized-Leader (FTARL) algorithm. Our innovative algorithm incorporates adaptive regularizers that depend explicitly on past decisions, allowing us to enforce stage-wise constraints while simultaneously enabling us to establish tight regret bounds. We also discuss the implications of the length of history horizon on design of no-regret algorithms for our problem and present impossibility results when it is the full learning horizon.

LGJun 11, 2025
"What are my options?": Explaining RL Agents with Diverse Near-Optimal Alternatives (Extended)

Noel Brindise, Vijeth Hebbar, Riya Shah et al.

In this work, we provide an extended discussion of a new approach to explainable Reinforcement Learning called Diverse Near-Optimal Alternatives (DNA), first proposed at L4DC 2025. DNA seeks a set of reasonable "options" for trajectory-planning agents, optimizing policies to produce qualitatively diverse trajectories in Euclidean space. In the spirit of explainability, these distinct policies are used to "explain" an agent's options in terms of available trajectory shapes from which a human user may choose. In particular, DNA applies to value function-based policies on Markov decision processes where agents are limited to continuous trajectories. Here, we describe DNA, which uses reward shaping in local, modified Q-learning problems to solve for distinct policies with guaranteed epsilon-optimality. We show that it successfully returns qualitatively different policies that constitute meaningfully different "options" in simulation, including a brief comparison to related approaches in the stochastic optimization field of Quality Diversity. Beyond the explanatory motivation, this work opens new possibilities for exploration and adaptive planning in RL.

MLJun 11, 2024
Any-Time Regret-Guaranteed Algorithm for Control of Linear Quadratic Systems

Jafar Abbaszadeh Chekan, Cedric Langbort

We propose a computationally efficient algorithm that achieves anytime regret of order $\mathcal{O}(\sqrt{t})$, with explicit dependence on the system dimensions and on the solution of the Discrete Algebraic Riccati Equation (DARE). Our approach builds on the SDP-based framework of \cite{cohen2019learning}, using an appropriately tuned regularization and a sufficiently accurate initial estimate to construct confidence ellipsoids for control design. A carefully designed input-perturbation mechanism is incorporated to ensure anytime performance. We develop two variants of the algorithm. The first enforces a notion of strong sequential stability, requiring each policy to be stabilizing and successive policies to remain close. However, enforcing this notion results in a suboptimal regret scaling. The second removes the sequential-stability requirement and instead requires only that each generated policy be stabilizing. Closed-loop stability is then preserved through a dwell-time-inspired policy-update rule, adapting ideas from switched-systems control to carefully balance exploration and exploitation. This class of algorithms also addresses key shortcomings of most existing approaches including certainty-equivalence-based methods which typically guarantee stability only in the Lyapunov sense and lack explicit uniform high-probability bounds on the state trajectory expressed in system-theoretic terms. Our analysis explicitly characterizes the trade-off between state amplification and regret, and shows that partially relaxing the sequential-stability requirement yields optimal regret. Finally, our method eliminates the need for any a priori bound on the norm of the DARE solution, an assumption required by all existing computationally efficient optimism in the face of uncertainty (OFU) based algorithms, and thereby removes the reliance of regret guarantees on such external inputs.

SYOct 9, 2018
Detection and Mitigation of Biasing Attacks on Distributed Estimation Networks

Mohammad Deghat, Valery Ugrinovskii, Iman Shames et al.

The paper considers a problem of detecting and mitigating biasing attacks on networks of state observers targeting cooperative state estimation algorithms. The problem is cast within the recently developed framework of distributed estimation utilizing the vector dissipativity approach. The paper shows that a network of distributed observers can be endowed with an additional attack detection layer capable of detecting biasing attacks and correcting their effect on estimates produced by the network. An example is provided to illustrate the performance of the proposed distributed attack detector.

MLFeb 19, 2018
On Estimating Multi-Attribute Choice Preferences using Private Signals and Matrix Factorization

Venkata Sriram Siddhardh Nadendla, Cedric Langbort

Revealed preference theory studies the possibility of modeling an agent's revealed preferences and the construction of a consistent utility function. However, modeling agent's choices over preference orderings is not always practical and demands strong assumptions on human rationality and data-acquisition abilities. Therefore, we propose a simple generative choice model where agents are assumed to generate the choice probabilities based on latent factor matrices that capture their choice evaluation across multiple attributes. Since the multi-attribute evaluation is typically hidden within the agent's psyche, we consider a signaling mechanism where agents are provided with choice information through private signals, so that the agent's choices provide more insight about his/her latent evaluation across multiple attributes. We estimate the choice model via a novel multi-stage matrix factorization algorithm that minimizes the average deviation of the factor estimates from choice data. Simulation results are presented to validate the estimation performance of our proposed algorithm.

GTJan 27, 2017
Optimal Communication Strategies in Networked Cyber-Physical Systems with Adversarial Elements

Emrah Akyol, Kenneth Rose, Tamer Basar et al.

This paper studies optimal communication and coordination strategies in cyber-physical systems for both defender and attacker within a game-theoretic framework. We model the communication network of a cyber-physical system as a sensor network which involves one single Gaussian source observed by many sensors, subject to additive independent Gaussian observation noises. The sensors communicate with the estimator over a coherent Gaussian multiple access channel. The aim of the receiver is to reconstruct the underlying source with minimum mean squared error. The scenario of interest here is one where some of the sensors are captured by the attacker and they act as the adversary (jammer): they strive to maximize distortion. The receiver (estimator) knows the captured sensors but still cannot simply ignore them due to the multiple access channel, i.e., the outputs of all sensors are summed to generate the estimator input. We show that the ability of transmitter sensors to secretly agree on a random event, that is "coordination", plays a key role in the analysis...

CRJul 12, 2016
Scalar Quadratic-Gaussian Soft Watermarking Games

Kivanc Mihcak, Emrah Akyol, Tamer Basar et al.

We introduce the zero-sum game problem of soft watermarking: The hidden information (watermark) comes from a continuum and has a perceptual value; the receiver generates an estimate of the embedded watermark to minimize the expected estimation error (unlike the conventional watermarking schemes where both the hidden information and the receiver output are from a discrete finite set). Applications include embedding a multimedia content into another. We consider in this paper the scalar Gaussian case and use expected mean-squared distortion. We formulate the resulting problem as a zero-sum game between the encoder & receiver pair and the attacker. We show that for the lin- ear encoder, the optimal attacker is Gaussian-affine, derive the optimal system parameters in that case, and discuss the corresponding system behavior. We also provide numerical results to gain further insight and understanding of the system behavior at optimality.

SYSep 17, 2016
Detection of Biasing Attacks on Distributed Estimation Networks

Mohammad Deghat, Valery Ugrinovskii, Iman Shames et al.

The paper addresses the problem of detecting attacks on distributed estimator networks that aim to intentionally bias process estimates produced by the network. It provides a sufficient condition, in terms of the feasibility of certain linear matrix inequalities, which guarantees distributed input attack detection using an $H_\infty$ approach.

GTJun 25, 2015
Estimation with Strategic Sensors

Farhad Farokhi, Andre M. H. Teixeira, Cedric Langbort

We introduce a model of estimation in the presence of strategic, self-interested sensors. We employ a game-theoretic setup to model the interaction between the sensors and the receiver. The cost function of the receiver is equal to the estimation error variance while the cost function of the sensor contains an extra term which is determined by its private information. We start by the single sensor case in which the receiver has access to a noisy but honest side information in addition to the message transmitted by a strategic sensor. We study both static and dynamic estimation problems. For both these problems, we characterize a family of equilibria in which the sensor and the receiver employ simple strategies. Interestingly, for the dynamic estimation problem, we find an equilibrium for which the strategic sensor uses a memory-less policy. We generalize the static estimation setup to multiple sensors with synchronous communication structure (i.e., all the sensors transmit their messages simultaneously). We prove the maybe surprising fact that, for the constructed equilibrium in affine strategies, the estimation quality degrades as the number of sensors increases. However, if the sensors are herding (i.e., copying each other policies), the quality of the receiver's estimation improves as the number of sensors increases. Finally, we consider the asynchronous communication structure (i.e., the sensors transmit their messages sequentially).