Todd D. Murphey

RO
h-index1
30papers
1,329citations
Novelty53%
AI Score43

30 Papers

LGSep 26, 2023
Maximum diffusion reinforcement learning

Thomas A. Berrueta, Allison Pinosky, Todd D. Murphey

Robots and animals both experience the world through their bodies and senses. Their embodiment constrains their experiences, ensuring they unfold continuously in space and time. As a result, the experiences of embodied agents are intrinsically correlated. Correlations create fundamental challenges for machine learning, as most techniques rely on the assumption that data are independent and identically distributed. In reinforcement learning, where data are directly collected from an agent's sequential experiences, violations of this assumption are often unavoidable. Here, we derive a method that overcomes this issue by exploiting the statistical mechanics of ergodic processes, which we term maximum diffusion reinforcement learning. By decorrelating agent experiences, our approach provably enables single-shot learning in continuous deployments over the course of individual task attempts. Moreover, we prove our approach generalizes well-known maximum entropy techniques, and robustly exceeds state-of-the-art performance across popular benchmarks. Our results at the nexus of physics, learning, and control form a foundation for transparent and reliable decision-making in embodied reinforcement learning agents.

ROSep 30, 2023
Automated Gait Generation For Walking, Soft Robotic Quadrupeds

Jake Ketchum, Sophia Schiffer, Muchen Sun et al.

Gait generation for soft robots is challenging due to the nonlinear dynamics and high dimensional input spaces of soft actuators. Limitations in soft robotic control and perception force researchers to hand-craft open loop controllers for gait sequences, which is a non-trivial process. Moreover, short soft actuator lifespans and natural variations in actuator behavior limit machine learning techniques to settings that can be learned on the same time scales as robot deployment. Lastly, simulation is not always possible, due to heterogeneity and nonlinearity in soft robotic materials and their dynamics change due to wear. We present a sample-efficient, simulation free, method for self-generating soft robot gaits, using very minimal computation. This technique is demonstrated on a motorized soft robotic quadruped that walks using four legs constructed from 16 "handed shearing auxetic" (HSA) actuators. To manage the dimension of the search space, gaits are composed of two sequential sets of leg motions selected from 7 possible primitives. Pairs of primitives are executed on one leg at a time; we then select the best-performing pair to execute while moving on to subsequent legs. This method -- which uses no simulation, sophisticated computation, or user input -- consistently generates good translation and rotation gaits in as low as 4 minutes of hardware experimentation, outperforming hand-crafted gaits. This is the first demonstration of completely autonomous gait generation in a soft robot.

ROMar 18
Manufacturing Micro-Patterned Surfaces with Multi-Robot Systems

Annalisa T. Taylor, Malachi Landis, Ping Guo et al.

Applying micro-patterns to surfaces has been shown to impart useful physical properties such as drag reduction and hydrophobicity. However, current manufacturing techniques cannot produce micro-patterned surfaces at scale due to high-cost machinery and inefficient coverage techniques such as raster-scanning. In this work, we use multiple robots, each equipped with a patterning tool, to manufacture these surfaces. To allow these robots to coordinate during the patterning task, we use the ergodic control algorithm, which specifies coverage objectives using distributions. We demonstrate that robots can divide complicated coverage objectives by communicating compressed representations of their trajectory history both in simulations and experimental trials. Further, we show that robot-produced patterning can lower the coefficient of friction of metallic surfaces. This work demonstrates that distributed multi-robot systems can coordinate to manufacture products that were previously unrealizable at scale.

CVMar 3, 2025
Data Augmentation for NeRFs in the Low Data Limit

Ayush Gaggar, Todd D. Murphey

Current methods based on Neural Radiance Fields fail in the low data limit, particularly when training on incomplete scene data. Prior works augment training data only in next-best-view applications, which lead to hallucinations and model collapse with sparse data. In contrast, we propose adding a set of views during training by rejection sampling from a posterior uncertainty distribution, generated by combining a volumetric uncertainty estimator with spatial coverage. We validate our results on partially observed scenes; on average, our method performs 39.9% better with 87.5% less variability across established scene reconstruction benchmarks, as compared to state of the art baselines. We further demonstrate that augmenting the training set by sampling from any distribution leads to better, more consistent scene reconstruction in sparse environments. This work is foundational for robotic tasks where augmenting a dataset with informative data is critical in resource-constrained, a priori unknown environments. Videos and source code are available at https://murpheylab.github.io/low-data-nerf/.

ROJun 25, 2021
Active Learning in Robotics: A Review of Control Principles

Annalisa T. Taylor, Thomas A. Berrueta, Todd D. Murphey

Active learning is a decision-making process. In both abstract and physical settings, active learning demands both analysis and action. This is a review of active learning in robotics, focusing on methods amenable to the demands of embodied learning systems. Robots must be able to learn efficiently and flexibly through continuous online deployment. This poses a distinct set of control-oriented challenges -- one must choose suitable measures as objectives, synthesize real-time control, and produce analyses that guarantee performance and safety with limited knowledge of the environment or robot itself. In this work, we survey the fundamental components of robotic active learning systems. We discuss classes of learning tasks that robots typically encounter, measures with which they gauge the information content of observations, and algorithms for generating action plans. Moreover, we provide a variety of examples -- from environmental mapping to nonparametric shape estimation -- that highlight the qualitative differences between learning tasks, information measures, and control techniques. We conclude with a discussion of control-oriented open challenges, including safety-constrained learning and distributed learning.

STAT-MECHJan 3, 2021
Low rattling: A predictive principle for self-organization in active collectives

Pavel Chvykov, Thomas A. Berrueta, Akash Vardhan et al.

Self-organization is frequently observed in active collectives, from ant rafts to molecular motor assemblies. General principles describing self-organization away from equilibrium have been challenging to identify. We offer a unifying framework that models the behavior of complex systems as largely random, while capturing their configuration-dependent response to external forcing. This allows derivation of a Boltzmann-like principle for understanding and manipulating driven self-organization. We validate our predictions experimentally in shape-changing robotic active matter, and outline a methodology for controlling collective behavior. Our findings highlight how emergent order depends sensitively on the matching between external patterns of forcing and internal dynamical response properties, pointing towards future approaches for design and control of active particle mixtures and metamaterials.

RODec 9, 2020
Dynamical System Segmentation for Information Measures in Motion

Thomas A. Berrueta, Ana Pervan, Kathleen Fitzsimons et al.

Motions carry information about the underlying task being executed. Previous work in human motion analysis suggests that complex motions may result from the composition of fundamental submovements called movemes. The existence of finite structure in motion motivates information-theoretic approaches to motion analysis and robotic assistance. We define task embodiment as the amount of task information encoded in an agent's motions. By decoding task-specific information embedded in motion, we can use task embodiment to create detailed performance assessments. We extract an alphabet of behaviors comprising a motion without \textit{a priori} knowledge using a novel algorithm, which we call dynamical system segmentation. For a given task, we specify an optimal agent, and compute an alphabet of behaviors representative of the task. We identify these behaviors in data from agent executions, and compare their relative frequencies against that of the optimal agent using the Kullback-Leibler divergence. We validate this approach using a dataset of human subjects (n=53) performing a dynamic task, and under this measure find that individuals receiving assistance better embody the task. Moreover, we find that task embodiment is a better predictor of assistance than integrated mean-squared-error.

RONov 30, 2020
Learning from Human Directional Corrections

Wanxin Jin, Todd D. Murphey, Zehui Lu et al.

This paper proposes a novel approach that enables a robot to learn an objective function incrementally from human directional corrections. Existing methods learn from human magnitude corrections; since a human needs to carefully choose the magnitude of each correction, those methods can easily lead to over-corrections and learning inefficiency. The proposed method only requires human directional corrections -- corrections that only indicate the direction of an input change without indicating its magnitude. We only assume that each correction, regardless of its magnitude, points in a direction that improves the robot's current motion relative to an unknown objective function. The allowable corrections satisfying this assumption account for half of the input space, as opposed to the magnitude corrections which have to lie in a shrinking level set. For each directional correction, the proposed method updates the estimate of the objective function based on a cutting plane method, which has a geometric interpretation. We have established theoretical results to show the convergence of the learning process. The proposed method has been tested in numerical examples, a user study on two human-robot games, and a real-world quadrotor experiment. The results confirm the convergence of the proposed method and further show that the method is significantly more effective (higher success rate), efficient/effortless (less human corrections needed), and potentially more accessible (fewer early wasted trials) than the state-of-the-art robot learning frameworks.

ROOct 22, 2020
Dynamics and Domain Randomized Gait Modulation with Bezier Curves for Sim-to-Real Legged Locomotion

Maurice Rahme, Ian Abraham, Matthew L. Elwin et al.

We present a sim-to-real framework that uses dynamics and domain randomized offline reinforcement learning to enhance open-loop gaits for legged robots, allowing them to traverse uneven terrain without sensing foot impacts. Our approach, D$^2$-Randomized Gait Modulation with Bezier Curves (D$^2$-GMBC), uses augmented random search with randomized dynamics and terrain to train, in simulation, a policy that modifies the parameters and output of an open-loop Bezier curve gait generator for quadrupedal robots. The policy, using only inertial measurements, enables the robot to traverse unknown rough terrain, even when the robot's physical parameters do not match the open-loop model. We compare the resulting policy to hand-tuned Bezier Curve gaits and to policies trained without randomization, both in simulation and on a real quadrupedal robot. With D$^2$-GMBC, across a variety of experiments on unobserved and unknown uneven terrain, the robot walks significantly farther than with either hand-tuned gaits or gaits learned without domain randomization. Additionally, using D$^2$-GMBC, the robot can walk laterally and rotate while on the rough terrain, even though it was trained only for forward walking.

MLOct 12, 2020
Derivative-Based Koopman Operators for Real-Time Control of Robotic Systems

Giorgos Mamakoukas, Maria L. Castano, Xiaobo Tan et al.

This paper presents a generalizable methodology for data-driven identification of nonlinear dynamics that bounds the model error in terms of the prediction horizon and the magnitude of the derivatives of the system states. Using higher-order derivatives of general nonlinear dynamics that need not be known, we construct a Koopman operator-based linear representation and utilize Taylor series accuracy analysis to derive an error bound. The resulting error formula is used to choose the order of derivatives in the basis functions and obtain a data-driven Koopman model using a closed-form expression that can be computed in real time. Using the inverted pendulum system, we illustrate the robustness of the error bounds given noisy measurements of unknown dynamics, where the derivatives are estimated numerically. When combined with control, the Koopman representation of the nonlinear system has marginally better performance than competing nonlinear modeling methods, such as SINDy and NARX. In addition, as a linear model, the Koopman approach lends itself readily to efficient control design tools, such as LQR, whereas the other modeling approaches require nonlinear control methods. The efficacy of the approach is further demonstrated with simulation and experimental results on the control of a tail-actuated robotic fish. Experimental results show that the proposed data-driven control approach outperforms a tuned PID (Proportional Integral Derivative) controller and that updating the data-driven model online significantly improves performance in the presence of unmodeled fluid disturbance. This paper is complemented with a video: https://youtu.be/9_wx0tdDta0.

ROAug 5, 2020
Learning from Sparse Demonstrations

Wanxin Jin, Todd D. Murphey, Dana Kulić et al.

This paper develops the method of Continuous Pontryagin Differentiable Programming (Continuous PDP), which enables a robot to learn an objective function from a few sparsely demonstrated keyframes. The keyframes, labeled with some time stamps, are the desired task-space outputs, which a robot is expected to follow sequentially. The time stamps of the keyframes can be different from the time of the robot's actual execution. The method jointly finds an objective function and a time-warping function such that the robot's resulting trajectory sequentially follows the keyframes with minimal discrepancy loss. The Continuous PDP minimizes the discrepancy loss using projected gradient descent, by efficiently solving the gradient of the robot trajectory with respect to the unknown parameters. The method is first evaluated on a simulated robot arm and then applied to a 6-DoF quadrotor to learn an objective function for motion planning in unmodeled environments. The results show the efficiency of the method, its ability to handle time misalignment between keyframes and robot execution, and the generalization of objective learning into unseen motion conditions.

ROJul 17, 2020
Information Requirements of Collision-Based Micromanipulation

Alexandra Q. Nilles, Ana Pervan, Thomas A. Berrueta et al.

We present a task-centered formal analysis of the relative power of several robot designs, inspired by the unique properties and constraints of micro-scale robotic systems. Our task of interest is object manipulation because it is a fundamental prerequisite for more complex applications such as micro-scale assembly or cell manipulation. Motivated by the difficulty in observing and controlling agents at the micro-scale, we focus on the design of boundary interactions: the robot's motion strategy when it collides with objects or the environment boundary, otherwise known as a bounce rule. We present minimal conditions on the sensing, memory, and actuation requirements of periodic ``bouncing'' robot trajectories that move an object in a desired direction through the incidental forces arising from robot-object collisions. Using an information space framework and a hierarchical controller, we compare several robot designs, emphasizing the information requirements of goal completion under different initial conditions, as well as what is required to recognize irreparable task failure. Finally, we present a physically-motivated model of boundary interactions, and analyze the robustness and dynamical properties of resulting trajectories.

ROJun 12, 2020
Algorithmic Design for Embodied Intelligence in Synthetic Cells

Ana Pervan, Todd D. Murphey

In nature, biological organisms jointly evolve both their morphology and their neurological capabilities to improve their chances for survival. Consequently, task information is encoded in both their brains and their bodies. In robotics, the development of complex control and planning algorithms often bears sole responsibility for improving task performance. This dependence on centralized control can be problematic for systems with computational limitations, such as mechanical systems and robots on the microscale. In these cases we need to be able to offload complex computation onto the physical morphology of the system. To this end, we introduce a methodology for algorithmically arranging sensing and actuation components into a robot design while maintaining a low level of design complexity (quantified using a measure of graph entropy), and a high level of task embodiment (evaluated by analyzing the Kullback-Leibler divergence between physical executions of the robot and those of an idealized system). This approach computes an idealized, unconstrained control policy which is projected onto a limited selection of sensors and actuators in a given library, resulting in intelligence that is distributed away from a central processor and instead embodied in the physical body of a robot. The method is demonstrated by computationally optimizing a simulated synthetic cell.

ROJun 5, 2020
Hybrid Control for Learning Motor Skills

Ian Abraham, Alexander Broad, Allison Pinosky et al.

We develop a hybrid control approach for robot learning based on combining learned predictive models with experience-based state-action policy mappings to improve the learning capabilities of robotic systems. Predictive models provide an understanding of the task and the physics (which improves sample-efficiency), while experience-based policy mappings are treated as "muscle memory" that encode favorable actions as experiences that override planned actions. Hybrid control tools are used to create an algorithmic approach for combining learned predictive models with experience-based learning. Hybrid learning is presented as a method for efficiently learning motor skills by systematically combining and improving the performance of predictive models and experience-based policies. A deterministic variation of hybrid learning is derived and extended into a stochastic implementation that relaxes some of the key assumptions in the original derivation. Each variation is tested on experience-based learning methods (where the robot interacts with the environment to gain experience) as well as imitation learning methods (where experience is provided through demonstrations and tested in the environment). The results show that our method is capable of improving the performance and sample-efficiency of learning motor skills in a variety of experimental domains.

ROJun 5, 2020
An Ergodic Measure for Active Learning From Equilibrium

Ian Abraham, Ahalya Prabhakar, Todd D. Murphey

This paper develops KL-Ergodic Exploration from Equilibrium ($\text{KL-E}^3$), a method for robotic systems to integrate stability into actively generating informative measurements through ergodic exploration. Ergodic exploration enables robotic systems to indirectly sample from informative spatial distributions globally, avoiding local optima, and without the need to evaluate the derivatives of the distribution against the robot dynamics. Using hybrid systems theory, we derive a controller that allows a robot to exploit equilibrium policies (i.e., policies that solve a task) while allowing the robot to explore and generate informative data using an ergodic measure that can extend to high-dimensional states. We show that our method is able to maintain Lyapunov attractiveness with respect to the equilibrium task while actively generating data for learning tasks such, as Bayesian optimization, model learning, and off-policy reinforcement learning. In each example, we show that our proposed method is capable of generating an informative distribution of data while synthesizing smooth control signals. We illustrate these examples using simulated systems and provide simplification of our method for real-time online learning in robotic systems.

ROJun 4, 2020
Model-Based Generalization Under Parameter Uncertainty Using Path Integral Control

Ian Abraham, Ankur Handa, Nathan Ratliff et al.

This work addresses the problem of robot interaction in complex environments where online control and adaptation is necessary. By expanding the sample space in the free energy formulation of path integral control, we derive a natural extension to the path integral control that embeds uncertainty into action and provides robustness for model-based robot planning. Our algorithm is applied to a diverse set of tasks using different robots and validate our results in simulation and real-world experiments. We further show that our method is capable of running in real-time without loss of performance. Videos of the experiments as well as additional implementation details can be found at https://sites.google.com/view/emppi.

ROMay 8, 2020
Learning Stable Models for Prediction and Control

Giorgos Mamakoukas, Ian Abraham, Todd D. Murphey

This paper demonstrates the benefits of imposing stability on data-driven Koopman operators. The data-driven identification of stable Koopman operators (DISKO) is implemented using an algorithm \cite{mamakoukas_stableLDS2020} that computes the nearest \textit{stable} matrix solution to a least-squares reconstruction error. As a first result, we derive a formula that describes the prediction error of Koopman representations for an arbitrary number of time steps, and which shows that stability constraints can improve the predictive accuracy over long horizons. As a second result, we determine formal conditions on basis functions of Koopman operators needed to satisfy the stability properties of an underlying nonlinear system. As a third result, we derive formal conditions for constructing Lyapunov functions for nonlinear systems out of stable data-driven Koopman operators, which we use to verify stabilizing control from data. Lastly, we demonstrate the benefits of DISKO in prediction and control with simulations using a pendulum and a quadrotor and experiments with a pusher-slider system. The paper is complemented with a video: \url{https://sites.google.com/view/learning-stable-koopman}.

ROJun 12, 2019
Active Learning of Dynamics for Data-Driven Control Using Koopman Operators

Ian Abraham, Todd D. Murphey

This paper presents an active learning strategy for robotic systems that takes into account task information, enables fast learning, and allows control to be readily synthesized by taking advantage of the Koopman operator representation. We first motivate the use of representing nonlinear systems as linear Koopman operator systems by illustrating the improved model-based control performance with an actuated Van der Pol system. Information-theoretic methods are then applied to the Koopman operator formulation of dynamical systems where we derive a controller for active learning of robot dynamics. The active learning controller is shown to increase the rate of information about the Koopman operator. In addition, our active learning controller can readily incorporate policies built on the Koopman dynamics, enabling the benefits of fast active learning and improved control. Results using a quadcopter illustrate single-execution active learning and stabilization capabilities during free-fall. The results for active learning are extended for automating Koopman observables and we implement our method on real robotic systems.

ROFeb 8, 2019
Active Area Coverage from Equilibrium

Ian Abraham, Ahalya Prabhakar, Todd D. Murphey

This paper develops a method for robots to integrate stability into actively seeking out informative measurements through coverage. We derive a controller using hybrid systems theory that allows us to consider safe equilibrium policies during active data collection. We show that our method is able to maintain Lyapunov attractiveness while still actively seeking out data. Using incremental sparse Gaussian processes, we define distributions which allow a robot to actively seek out informative measurements. We illustrate our methods for shape estimation using a cart double pendulum, dynamic model learning of a hovering quadrotor, and generating galloping gaits starting from stationary equilibrium by learning a dynamics model for the half-cheetah system from the Roboschool environment.

ROJun 13, 2018
Decentralized Ergodic Control: Distribution-Driven Sensing and Exploration for Multi-Agent Systems

Ian Abraham, Todd D. Murphey

We present a decentralized ergodic control policy for time-varying area coverage problems for multiple agents with nonlinear dynamics. Ergodic control allows us to specify distributions as objectives for area coverage problems for nonlinear robotic systems as a closed-form controller. We derive a variation to the ergodic control policy that can be used with consensus to enable a fully decentralized multi-agent control policy. Examples are presented to illustrate the applicability of our method for multi-agent terrain mapping as well as target localization. An analysis on ergodic policies as a Nash equilibrium is provided for game theoretic applications.

ROMay 31, 2018
Data-Driven Measurement Models for Active Localization in Sparse Environments

Ian Abraham, Anastasia Mavrommati, Todd D. Murphey

We develop an algorithm to explore an environment to generate a measurement model for use in future localization tasks. Ergodic exploration with respect to the likelihood of a particular class of measurement (e.g., a contact detection measurement in tactile sensing) enables construction of the measurement model. Exploration with respect to the information density based on the data-driven measurement model enables localization. We test the two-stage approach in simulations of tactile sensing, illustrating that the algorithm is capable of identifying and localizing objects based on sparsely distributed binary contacts. Comparisons with our method show that visiting low probability regions lead to acquisition of new information rather than increasing the likelihood of known information. Experiments with the Sphero SPRK robot validate the efficacy of this method for collision-based estimation and localization of the environment.

ROApr 24, 2018
Feedback Synthesis For Underactuated Systems Using Sequential Second-Order Needle Variations

Giorgos Mamakoukas, Malcolm A. MacIver, Todd D. Murphey

This paper derives nonlinear feedback control synthesis for general control affine systems using second-order actions---the second-order needle variations of optimal control---as the basis for choosing each control response to the current state. A second result of the paper is that the method provably exploits the nonlinear controllability of a system by virtue of an explicit dependence of the second-order needle variation on the Lie bracket between vector fields. As a result, each control decision necessarily decreases the objective when the system is nonlinearly controllable using first-order Lie brackets. Simulation results using a differential drive cart, an underactuated kinematic vehicle in three dimensions, and an underactuated dynamic model of an underwater vehicle demonstrate that the method finds control solutions when the first-order analysis is singular. Lastly, the underactuated dynamic underwater vehicle model demonstrates convergence even in the presence of a velocity field.

ROSep 11, 2017
Dynamic Task Execution using Active Parameter Identification with the Baxter Research Robot

Andrew D. Wilson, Jarvis A. Schultz, Alex R. Ansari et al.

This paper presents experimental results from real-time parameter estimation of a system model and subsequent trajectory optimization for a dynamic task using the Baxter Research Robot from Rethink Robotics. An active estimator maximizing Fisher information is used in real-time with a closed-loop, non-linear control technique known as Sequential Action Control. Baxter is tasked with estimating the length of a string connected to a load suspended from the gripper with a load cell providing the single source of feedback to the estimator. Following the active estimation, a trajectory is generated using the trep software package that controls Baxter to dynamically swing a suspended load into a box. Several trials are presented with varying initial estimates showing that estimation is required to obtain adequate open-loop trajectories to complete the prescribed task. The result of one trial with and without the active estimation is also shown in the accompanying video.

ROSep 11, 2017
Trajectory Synthesis for Fisher Information Maximization

Andrew D. Wilson, Jarvis A. Schultz, Todd D. Murphey

Estimation of model parameters in a dynamic system can be significantly improved with the choice of experimental trajectory. For general, nonlinear dynamic systems, finding globally "best" trajectories is typically not feasible; however, given an initial estimate of the model parameters and an initial trajectory, we present a continuous-time optimization method that produces a locally optimal trajectory for parameter estimation in the presence of measurement noise. The optimization algorithm is formulated to find system trajectories that improve a norm on the Fisher information matrix. A double-pendulum cart apparatus is used to numerically and experimentally validate this technique. In simulation, the optimized trajectory increases the minimum eigenvalue of the Fisher information matrix by three orders of magnitude compared to the initial trajectory. Experimental results show that this optimized trajectory translates to an order of magnitude improvement in the parameter estimate error in practice.

OCSep 6, 2017
Feedback Synthesis for Controllable Underactuated Systems using Sequential Second Order Actions

Giorgos Mamakoukas, Malcolm A. MacIver, Todd D. Murphey

This paper derives nonlinear feedback control synthesis for general control affine systems using second-order actions---the needle variations of optimal control---as the basis for choosing each control response to the current state. A second result of the paper is that the method provably exploits the nonlinear controllability of a system by virtue of an explicit dependence of the second-order needle variation on the Lie bracket between vector fields. As a result, each control decision necessarily decreases the objective when the system is nonlinearly controllable using first-order Lie brackets. Simulation results using a differential drive cart, an underactuated kinematic vehicle in three dimensions, and an underactuated dynamic model of an underwater vehicle demonstrate that the method finds control solutions when the first-order analysis is singular. Moreover, the simulated examples demonstrate superior convergence when compared to synthesis based on first-order needle variations. Lastly, the underactuated dynamic underwater vehicle model demonstrates the convergence even in the presence of a velocity field.

ROSep 5, 2017
Model-Based Control Using Koopman Operators

Ian Abraham, Gerardo De La Torre, Todd D. Murphey

This paper explores the application of Koopman operator theory to the control of robotic systems. The operator is introduced as a method to generate data-driven models that have utility for model-based control methods. We then motivate the use of the Koopman operator towards augmenting model-based control. Specifically, we illustrate how the operator can be used to obtain a linearizable data-driven model for an unknown dynamical process that is useful for model-based control synthesis. Simulated results show that with increasing complexity in the choice of the basis functions, a closed-loop controller is able to invert and stabilize a cart- and VTOL-pendulum systems. Furthermore, the specification of the basis function are shown to be of importance when generating a Koopman operator for specific robotic systems. Experimental results with the Sphero SPRK robot explore the utility of the Koopman operator in a reduced state representation setting where increased complexity in the basis function improve open- and closed-loop controller performance in various terrains, including sand.

ROSep 5, 2017
Ergodic Exploration using Binary Sensing for Non-Parametric Shape Estimation

Ian Abraham, Ahalya Prabhakar, Mitra J. Z. Hartmann et al.

Current methods to estimate object shape---using either vision or touch---generally depend on high-resolution sensing. Here, we exploit ergodic exploration to demonstrate successful shape estimation when using a low-resolution binary contact sensor. The measurement model is posed as a collision-based tactile measurement, and classification methods are used to discriminate between shape boundary regions in the search space. Posterior likelihood estimates of the measurement model help the system actively seek out regions where the binary sensor is most likely to return informative measurements. Results show successful shape estimation of various objects as well as the ability to identify multiple objects in an environment. Interestingly, it is shown that ergodic exploration utilizes non-contact motion to gather significant information about shape. The algorithm is extended in three dimensions in simulation and we present two dimensional experimental results using the Rethink Baxter robot.

OCAug 31, 2017
Real-time Dynamic-Mode Scheduling Using Single-Integration Hybrid Optimization for Linear Time-Varying Systems

Anastasia Mavrommati, Jarvis A. Schultz, Todd D. Murphey

This paper considers the problem of real-time mode scheduling in linear time-varying switched systems subject to a quadratic cost functional. The execution time of hybrid control algorithms is often prohibitive for real-time applications and typically may only be reduced at the expense of approximation accuracy. We address this trade-off by taking advantage of system linearity to formulate a projection-based approach so that no simulation is required during open-loop optimization. A numerical example shows how the proposed open-loop algorithm outperforms methods employing common numerical integration techniques. Additionally, we follow a receding-horizon scheme to apply real-time closed-loop hybrid control to a customized experimental setup, using the Robot Operating System (ROS). In particular, we demonstrate---both in Monte-Carlo simulation and in experiment---that optimal hybrid control efficiently regulates a cart and suspended mass system in real time.

ROAug 30, 2017
Ergodic Exploration of Distributed Information

Lauren M. Miller, Yonatan Silverman, Malcolm A. MacIver et al.

This paper presents an active search trajectory synthesis technique for autonomous mobile robots with nonlinear measurements and dynamics. The presented approach uses the ergodicity of a planned trajectory with respect to an expected information density map to close the loop during search. The ergodic control algorithm does not rely on discretization of the search or action spaces, and is well posed for coverage with respect to the expected information density whether the information is diffuse or localized, thus trading off between exploration and exploitation in a single objective function. As a demonstration, we use a robotic electrolocation platform to estimate location and size parameters describing static targets in an underwater environment. Our results demonstrate that the ergodic exploration of distributed information (EEDI) algorithm outperforms commonly used information-oriented controllers, particularly when distractions are present.

ROAug 28, 2017
Real-Time Area Coverage and Target Localization using Receding-Horizon Ergodic Exploration

Anastasia Mavrommati, Emmanouil Tzorakoleftherakis, Ian Abraham et al.

Although a number of solutions exist for the problems of coverage, search and target localization---commonly addressed separately---whether there exists a unified strategy that addresses these objectives in a coherent manner without being application-specific remains a largely open research question. In this paper, we develop a receding-horizon ergodic control approach, based on hybrid systems theory, that has the potential to fill this gap. The nonlinear model predictive control algorithm plans real-time motions that optimally improve ergodicity with respect to a distribution defined by the expected information density across the sensing domain. We establish a theoretical framework for global stability guarantees with respect to a distribution. Moreover, the approach is distributable across multiple agents, so that each agent can independently compute its own control while sharing statistics of its coverage across a communication network. We demonstrate the method in both simulation and in experiment in the context of target localization, illustrating that the algorithm is independent of the number of targets being tracked and can be run in real-time on computationally limited hardware platforms.