Zachary I. Bell

h-index: 12

Papers: 5

Citations: 523

Novelty: 56%

AI Score: 31

Ranked #130,132 of 194,257 authors (top 67%); #28,645 in LG (top 71%)

5 Papers

1.2 · SY · Mar 15, 2018

A Switched Systems Approach to Path Following with Intermittent State Feedback

Hsi-Yuan Chen, Zachary I. Bell, Patryk Deptula et al.

Autonomous agents are often tasked with operating in an area where feedback is unavailable. Inspired by such applications, this paper develops a novel switched systems-based control method for uncertain nonlinear systems with temporary loss of state feedback. To compensate for intermittent feedback, an observer is used while state feedback is available to reduce the estimation error, and a predictor is utilized to propagate the estimates while state feedback is unavailable. Based on the resulting subsystems, maximum and minimum dwell time conditions are developed via a Lyapunov-based switched systems analysis to relax the constraint of maintaining constant feedback. Using the dwell time conditions, a switching trajectory is developed to enter and exit the feedback-denied region in a manner that ensures the overall switched system remains stable. A scheme for designing a switching trajectory with a smooth transition function is provided. Simulation and experimental results are presented to demonstrate the performance of the control design.

4.9 · SY · Jun 17

Safe Output-Feedback Adaptive Optimal Control of Input-Constrained Control-Affine Nonlinear Systems

Tochukwu Elijah Ogri, Muzaffar Qureshi, Zachary I. Bell et al.

In this paper, a novel online, safe output-feedback, critic-only, adaptive optimal control framework is developed for safety-critical control of partially observable systems. The developed framework ensures system stability and safety, regardless of the lack of full-state measurements, while learning and implementing a near-optimal controller. The approach leverages linear matrix inequality-based observer design methods to efficiently search for observer gains for effective state estimation. Then, approximate dynamic programming is used to develop an approximate controller that uses simulated experiences to guarantee the safety and stability of the closed-loop system. Safety is enforced by adding a recentered robust Lyapunov-like barrier function to the cost function that effectively enforces safety constraints, even in the presence of state estimation errors. Lyapunov-based stability analysis is used to guarantee uniform ultimate boundedness of the trajectories of the closed-loop system and ensure safety. Simulation studies are performed to demonstrate the effectiveness of the developed method through two real-world safety-critical scenarios, specifically one ensuring that the state trajectories of a given system remain within a given set, and the other ensuring that the system avoids an obstacle.

1.2 · SY · May 15, 2025

System Identification and Control Using Lyapunov-Based Deep Neural Networks without Persistent Excitation: A Concurrent Learning Approach

Rebecca G. Hart, Omkar Sudhir Patil, Zachary I. Bell et al.

Deep Neural Networks (DNNs) are increasingly used in control applications due to their powerful function approximation capabilities. However, many existing formulations focus primarily on tracking error convergence, often neglecting the challenge of identifying the system dynamics using the DNN. This paper presents the first result on simultaneous trajectory tracking and online system identification using a DNN-based controller, without requiring persistent excitation. Two new concurrent learning adaptation laws are constructed for the weights of all the layers of the DNN, achieving convergence of the DNN's parameter estimates to a neighborhood of their ideal values, provided the DNN's Jacobian satisfies a finite-time excitation condition. A Lyapunov-based stability analysis is conducted to ensure convergence of the tracking error, weight estimation errors, and observer errors to a neighborhood of the origin. Simulations performed on a range of systems and trajectories, with the same initial and operating conditions, demonstrated 40.5% to 73.6% improvement in function approximation performance compared to the baseline, while maintaining a similar tracking error and control effort. Simulations evaluating function approximation capabilities on data points outside of the trajectory resulted in 58.88% and 74.75% improvement in function approximation compared to the baseline.

4.6 · LG · Oct 18, 2024

Inverse Reinforcement Learning from Non-Stationary Learning Agents

Kavinayan P. Sivakumar, Yi Shen, Zachary Bell et al.

In this paper, we study an inverse reinforcement learning problem that involves learning the reward function of a learning agent using trajectory data collected while this agent is learning its optimal policy. To address this problem, we propose an inverse reinforcement learning method that allows us to estimate the policy parameters of the learning agent, which can then be used to estimate its reward function. Our method relies on a new variant of the behavior cloning algorithm, which we call bundle behavior cloning, and uses a small number of trajectories generated by the learning agent's policy at different points in time to learn a set of policies that match the distribution of actions observed in the sampled trajectories. We then use the cloned policies to train a neural network model that estimates the reward function of the learning agent. We provide a theoretical analysis establishing a complexity bound for our method that improves on standard behavior cloning, as well as numerical experiments on a reinforcement learning problem that validate the proposed method.

2.6 · LG · Oct 18, 2024

Transfer Reinforcement Learning in Heterogeneous Action Spaces using Subgoal Mapping

Kavinayan P. Sivakumar, Yan Zhang, Zachary Bell et al.

In this paper, we consider a transfer reinforcement learning problem involving agents with different action spaces. Specifically, for any new unseen task, the goal is to use a successful demonstration of this task by an expert agent in its action space to enable a learner agent to learn an optimal policy in its own different action space with fewer samples than those required if the learner were learning on its own. Existing transfer learning methods across different action spaces either require handcrafted mappings between those action spaces provided by human experts, which can induce bias in the learning procedure, or require the expert agent to share its policy parameters with the learner agent, which does not generalize well to unseen tasks. In this work, we propose a method that learns a subgoal mapping between the expert agent policy and the learner agent policy. Since the expert agent and the learner agent have different action spaces, their optimal policies can have different subgoal trajectories. We learn this subgoal mapping by training a Long Short-Term Memory (LSTM) network for a distribution of tasks and then use this mapping to predict the learner subgoal sequence for unseen tasks, thereby improving the speed of learning by biasing the agent's policy towards the predicted learner subgoal sequence. Through numerical experiments, we demonstrate that the proposed learning scheme can effectively find the subgoal mapping underlying the given distribution of tasks. Moreover, letting the learner agent imitate the expert agent's policy with the learnt subgoal mapping can significantly improve the sample efficiency and training time of the learner agent in unseen new tasks.