Luis Sentis

RO
h-index: 36
56 papers
535 citations
Novelty: 45%
AI Score: 54

56 Papers

RO · May 28
ARISTO Hand: Sensing-Driven Distal Hyperextension for Fine-Grained Manipulation

Aaron Kim, Dong Ho Kang, Mark Helwig et al.

Manipulating thin objects requires precise contact geometry and reliable force perception, yet many anthropomorphic robotic hands lack the mechanical and sensing capabilities needed for such interactions. We present the ARISTO Hand, a tendon-driven robotic hand that integrates active distal hyperextension with a hybrid fingertip-sensing architecture that combines a rigid, nail-mounted force-torque sensor and a soft capacitive tactile array. Active hyperextension enables controlled fingertip engagement beyond the kinematic limits of standard flexion, increasing pull-out force by 2.76x for object thicknesses of 1-20 mm while preserving the nominal grasp capability. The rigid nail-mounted sensor provides reliable force measurements during edge contacts, where the sensitivity of proprioceptive force estimation degrades as the contact geometry approaches kinematic singularities. We validate the proposed architecture through quantitative force characterization and a multi-stage SD card extraction and insertion task. Video and supplementary materials are available at: https://aristohand.github.io

RO · Jun 3
Too Much of a Good Thing: When sim2real Efforts Impede Policy Learning (And What to Do About It)

Kyle Morgenstein, Bharath Masetty, Stephen Welch et al.

While sim2real efforts are necessary for effective policy transfer to hardware, there is such a thing as too much of a good thing. We argue that sim2real efforts have led to misaligned incentives with policy learning, resulting in simulator lock-in and poor policy exploration due to the unreasonable constraints imposed by the real world. We offer a diagnosis and explanation of the current status of the problem, and propose a potential solution via a sim2sim2real paradigm that leverages the robot's kinematics as the sole design constraint.

RO · Sep 19, 2022
Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments

Mingyo Seo, Ryan Gupta, Yifeng Zhu et al.

We tackle the problem of perceptive locomotion in dynamic environments. In this problem, a quadrupedal robot must exhibit robust and agile walking behaviors in response to environmental clutter and moving obstacles. We present a hierarchical learning framework, named PRELUDE, which decomposes the problem of perceptive locomotion into high-level decision-making to predict navigation commands and low-level gait generation to realize the target commands. In this framework, we train the high-level navigation controller with imitation learning on human demonstrations collected on a steerable cart and the low-level gait controller with reinforcement learning (RL). Therefore, our method can acquire complex navigation behaviors from human supervision and discover versatile gaits from trial and error. We demonstrate the effectiveness of our approach in simulation and with hardware experiments. Videos and code can be found at the project page: https://ut-austin-rpl.github.io/PRELUDE.

RO · May 18
PLATO Hand: Shaping Contact Behavior with Fingernails for Precise Manipulation

Dong Ho Kang, Aaron Kim, Mingyo Seo et al.

We present the PLATO Hand, a dexterous robotic hand with a hybrid fingertip that combines a rigid fingernail, embedded distal phalanx, and compliant pulp to shape contact behavior during manipulation. By mechanically organizing how contact is initiated, supported, and transmitted at the fingertip, this structure creates stable and task-relevant contact conditions across diverse object geometries and grasp orientations. We develop a strain-energy-based bending-indentation model to guide the fingertip design and to explain how material stiffness and contact geometry govern deformation partitioning within the fingertip. Experiments show improved pinch stability, improved fingernail-mediated dorsal-contact force transmission and proprioceptive observability, and successful execution of edge-sensitive manipulation tasks, including paper singulation, card picking, and orange peeling. These results show that coupling a mechanically structured contact interface with a force-motion-transparent finger mechanism provides a principled approach to precise manipulation. Our project page is at: https://platohand.github.io

SY · Nov 19, 2019
Modeling and Loop Shaping of Single-Joint Amplification Exoskeleton with Contact Sensing and Series Elastic Actuation

Binghan He, Gray C. Thomas, Nicholas Paine et al.

In this paper we consider a class of exoskeletons designed to amplify the strength of humans through feedback of sensed human-robot interactions and actuator forces. We define an amplification error signal based on a reference amplification rate, and design a linear feedback compensator to attenuate this error. Since the human operator is an integral part of the system, we design the compensator to be robust to both a realistic variation in human impedance and a large variation in load impedance. We demonstrate our strategy on a one-degree-of-freedom amplification exoskeleton connected to a human arm, following a three-dimensional matrix of experimentation: slow or fast human motion; light or extreme exoskeleton load; and soft or clenched human arm impedances. We demonstrate that a slightly aggressive controller results in a borderline stable system, but only for soft human musculoskeletal behavior and a heavy load. This class of exoskeleton systems is interesting because it can both amplify a human's interaction forces (so long as the human contacts the environment through the exoskeleton) and attenuate the operator's perception of the exoskeleton's reflected dynamics at frequencies within the bandwidth of the control.

SY · Mar 8, 2018
Investigations of a Robotic Testbed with Viscoelastic Liquid Cooled Actuators

Donghyun Kim, Junhyeok Ahn, Orion Campbell et al.

We design, build, and thoroughly test a new type of actuator dubbed viscoelastic liquid cooled actuator (VLCA) for robotic applications. VLCAs excel in the following five critical axes of performance: energy efficiency, torque density, impact resistance, joint position controllability, and force controllability. We first study the design objectives and choices of the VLCA to enhance the performance on the needed criteria. We follow with an investigation of viscoelastic materials in terms of their damping, viscous and hysteresis properties as well as parameters related to the long-term performance. As part of the actuator design, we configure a disturbance observer to provide high-fidelity force control to enable a wide range of impedance control capabilities. We proceed to design a robotic system capable of lifting payloads of 32.5 kg, which is three times larger than its own weight. In addition, we experiment with Cartesian trajectory control up to 2 Hz with a vertical range of motion of 32 cm while carrying a payload of 10 kg. Finally, we perform experiments on impedance control and mechanical robustness by studying the response of the robotics testbed to hammering impacts and external force interactions.

OC · Nov 20, 2019
Safety Control Synthesis with Input Limits: a Hybrid Approach

Gray C. Thomas, Binghan He, Luis Sentis

We introduce a hybrid (discrete-continuous) safety controller which enforces strict state and input constraints on a system, but acts only when necessary, preserving transparent operation of the original system within some safe region of the state space. We define this space using a Min-Quadratic Barrier function, which we construct along the equilibrium manifold using the Lyapunov functions which result from linear matrix inequality controller synthesis for locally valid uncertain linearizations. We also introduce the concept of a barrier pair, which makes it easy to extend the approach to include trajectory-based augmentations to the safe region, in the style of LQR-Trees. We demonstrate our controller and barrier pair synthesis method in simulation-based examples.
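
As a rough illustration of the safe-region idea in this abstract, the sketch below evaluates a barrier formed as the minimum of several quadratics. It assumes each quadratic has the form (x - c_i)^T P_i (x - c_i) and that the safe region is the set where the minimum is at most 1; the function names, the centers, and the shape matrices are hypothetical and are not taken from the paper.

```python
import numpy as np

def min_quadratic_barrier(x, centers, shapes):
    """Return (B(x), active index) with B(x) = min_i (x - c_i)^T P_i (x - c_i)."""
    values = [float((x - c) @ P @ (x - c)) for c, P in zip(centers, shapes)]
    i = int(np.argmin(values))
    return values[i], i

def is_safe(x, centers, shapes, level=1.0):
    """Assumed convention: the safe region is the union of ellipsoids B(x) <= level."""
    value, _ = min_quadratic_barrier(x, centers, shapes)
    return value <= level

# Hypothetical data: two unit balls centred on points of an equilibrium manifold.
centers = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
shapes = [np.eye(2), np.eye(2)]

value, active = min_quadratic_barrier(np.array([1.9, 0.5]), centers, shapes)
inside = is_safe(np.array([1.9, 0.5]), centers, shapes)
outside = is_safe(np.array([1.0, 2.0]), centers, shapes)
```

The active index identifies which local quadratic currently certifies safety. In a hybrid controller of the kind described, it could select the local controller to switch to, although that switching logic and the input limits are not modelled here.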

RO · Sep 26, 2024
HARMONIC: Cognitive and Control Collaboration in Human-Robotic Teams

Sanjay Oruganti, Sergei Nirenburg, Marjorie McShane et al.

This paper describes HARMONIC, a cognitive-robotic architecture that integrates the OntoAgent cognitive framework with general-purpose robot control systems applied to human-robot teaming (HRT). HARMONIC incorporates metacognition, meaningful natural language communication, and explainability capabilities required for developing mutual trust in HRT. Through simulation experiments involving a joint search task performed by a heterogeneous team of two HARMONIC-based robots and a human operator, we demonstrate heterogeneous robots that coordinate their actions, adapt to complex scenarios, and engage in natural human-robot communication. Evaluation results show that HARMONIC-based robots can reason about plans, goals, and team member attitudes while providing clear explanations for their decisions, which are essential requirements for realistic human-robot teaming.

RO · Oct 13, 2022
Sample Efficient Dynamics Learning for Symmetrical Legged Robots: Leveraging Physics Invariance and Geometric Symmetries

Jee-eun Lee, Jaemin Lee, Tirthankar Bandyopadhyay et al.

Model generalization of the underlying dynamics is critical for achieving data efficiency when learning for robot control. This paper proposes a novel approach for learning dynamics leveraging the symmetry in the underlying robotic system, which allows for robust extrapolation from fewer samples. Existing frameworks that represent all data in vector space fail to consider the structured information of the robot, such as leg symmetry, rotational symmetry, and physics invariance. As a result, these schemes require vast amounts of training data to learn the system's redundant elements because they are learned independently. Instead, we propose considering the geometric prior by representing the system in symmetrical object groups and designing neural network architecture to assess invariance and equivariance between the objects. Finally, we demonstrate the effectiveness of our approach by comparing the generalization to unseen data of the proposed model and the existing models. We also implement a controller of a climbing robot based on learned inverse dynamics models. The results show that our method generates accurate control inputs that help the robot reach the desired state while requiring less training data than existing methods.

OC · Jan 4, 2019
Quadric Inclusion Programs: an LMI Approach to H∞-Model Identification

Gray C. Thomas, Luis Sentis

Practical application of H∞ robust control relies on system identification of a valid model-set, described by a linear system in feedback with a stable norm-bounded uncertainty, which must explain all possible (or at least all previously measured) behavior for the control plant. Such models can be viewed as norm-bounded inclusions in the frequency domain, and this note introduces the "Quadric Inclusion Program" that can identify inclusions from input-output data as a convex problem. We prove several key properties of this algorithm and give a geometric interpretation for its behavior. While we stress that the inclusion fitting is outlier-sensitive by design, we offer a method to mitigate the effect of measurement noise. We apply this method to robustly approximate simulated frequency domain data using orthonormal basis functions. The result compares favorably with a least squares approach that satisfies the same data inclusion requirements.

RO · Mar 20
Why Cognitive Robotics Matters: Lessons from OntoAgent and LLM Deployment in HARMONIC for Safety-Critical Robot Teaming

Sanjay Oruganti, Sergei Nirenburg, Marjorie McShane et al.

Deploying embodied AI agents in the physical world demands cognitive capabilities for long-horizon planning that execute reliably, deterministically, and transparently. We present HARMONIC, a cognitive-robotic architecture that pairs OntoAgent, a content-centric cognitive architecture providing metacognitive self-monitoring, domain-grounded diagnosis, and consequence-based action selection over ontologically structured knowledge, with a modular reactive tactical layer. HARMONIC's modular design enables a functional evaluation of whether LLMs can replicate OntoAgent's cognitive capabilities, evaluated within the same robotic system under identical conditions. Six LLMs spanning frontier and efficient tiers replace OntoAgent in a collaborative maintenance scenario under native and knowledge-equalized conditions. Results reveal that LLMs do not consistently assess their own knowledge state before acting, causing downstream failures in diagnostic reasoning and action selection. These deficits persist even with equivalent procedural knowledge, indicating the issues are architectural rather than knowledge-based. These findings support the design of physically embodied systems in which cognitive architectures retain primary authority for reasoning, owing to their deterministic and transparent characteristics.

LG · Sep 20, 2023
Symbolic Regression on Sparse and Noisy Data with Gaussian Processes

Junette Hsin, Shubhankar Agarwal, Adam Thorpe et al.

In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. High-quality data is crucial for symbolic regression algorithms; limited and noisy data can present modeling challenges. To overcome this, we combine Gaussian process regression with a sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equations. Our approach, GPSINDy, offers improved robustness with sparse, noisy data compared to SINDy alone. We demonstrate its effectiveness on simulation data from Lotka-Volterra and unicycle models and hardware data from an NVIDIA JetRacer system. We show more than 50% improvement over SINDy and other baselines in predicting future trajectories from noise-corrupted, sparse 5 Hz data.
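
To make the SINDy half of this pipeline concrete, here is a minimal sketch of sequentially thresholded least squares, the sparse regression commonly used in SINDy, applied to noise-free Lotka-Volterra samples. The Gaussian process denoising stage is omitted, and the candidate library, the coefficients, and the threshold are assumptions chosen for illustration rather than the paper's settings.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, iterations=10):
    """Sequentially thresholded least squares: sparse xi with theta @ xi ~ dxdt."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(iterations):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                # Refit only the surviving terms for state k.
                xi[keep, k] = np.linalg.lstsq(theta[:, keep], dxdt[:, k], rcond=None)[0]
    return xi

def library(x):
    """Candidate terms [1, x1, x2, x1*x2] for a two-state system (assumed library)."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Synthetic samples with exact derivatives (hypothetical coefficients):
#   dx1/dt = 1.0*x1 - 0.5*x1*x2,   dx2/dt = -1.0*x2 + 0.5*x1*x2
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 2.0, size=(200, 2))
dxdt = np.column_stack([
    1.0 * x[:, 0] - 0.5 * x[:, 0] * x[:, 1],
    -1.0 * x[:, 1] + 0.5 * x[:, 0] * x[:, 1],
])

xi = stlsq(library(x), dxdt)  # rows: library terms, columns: states
```

With noisy measurements, the derivative estimates degrade and the thresholding step can keep or drop the wrong terms, which is the failure mode that smoothing the data beforehand is meant to reduce.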

RO · Mar 2, 2023
Learning Contact-based Navigation in Crowds

Kyle Morgenstein, Junfeng Jiao, Luis Sentis

Navigation strategies that intentionally incorporate contact with humans (i.e. "contact-based" social navigation) in crowded environments are largely unexplored even though collision-free social navigation is a well studied problem. Traditional social navigation frameworks require the robot to stop suddenly or "freeze" whenever a collision is imminent. This paradigm poses two problems: 1) freezing while navigating a crowd may cause people to trip and fall over the robot, resulting in more harm than the collision itself, and 2) in very dense social environments where collisions are unavoidable, such a control scheme would render the robot unable to move and preclude the opportunity to study how humans incorporate robots into these environments. However, if robots are to be meaningfully included in crowded social spaces, such as busy streets, subways, stores, or other densely populated locales, there may not exist trajectories that can guarantee zero collisions. Thus, adoption of robots in these environments requires the development of minimally disruptive navigation plans that can safely plan for and respond to contacts. We propose a learning-based motion planner and control scheme to navigate dense social environments using safe contacts for an omnidirectional mobile robot. The planner is evaluated in simulation over 360 trials with crowd densities varying between 0.0 and 1.6 people per square meter. Our navigation scheme is able to use contact to safely navigate in crowds of higher density than has been previously reported, to our knowledge.

RO · Sep 16, 2025
HARMONIC: A Content-Centric Cognitive Robotic Architecture

Sanjay Oruganti, Sergei Nirenburg, Marjorie McShane et al.

This paper introduces HARMONIC, a cognitive-robotic architecture designed for robots in human-robotic teams. HARMONIC supports semantic perception interpretation, human-like decision-making, and intentional language communication. It addresses the issues of safety and quality of results; aims to solve problems of data scarcity, explainability, and safety; and promotes transparency and trust. Two proof-of-concept HARMONIC-based robotic systems are demonstrated, each implemented in both a high-fidelity simulation environment and on physical robotic platforms.

RO · Feb 24, 2022
Data-Driven Safety Verification for Legged Robots

Junhyeok Ahn, Seung Hyeon Bang, Carlos Gonzalez et al.

Planning safe motions for legged robots requires sophisticated safety verification tools. However, designing such tools for such complex systems is challenging due to the nonlinear and high-dimensional nature of these systems' dynamics. In this letter, we present a probabilistic verification framework for legged systems, which evaluates the safety of planned trajectories by learning an assessment function from trajectories collected from a closed-loop system. Our approach does not require an analytic expression of the closed-loop dynamics, thus enabling safety verification of systems with complex models and controllers. Our framework consists of an offline stage that initializes a safety assessment function by simulating a nominal model and an online stage that adapts the function to address the sim-to-real gap. The performance of the proposed approach for safety verification is demonstrated using a quadruped balancing task and a humanoid reaching task. The results demonstrate that our framework accurately predicts the systems' safety both at the planning phase to generate robust trajectories and at execution phase to detect unexpected external disturbances.

RO · Dec 1, 2021
A Barrier Pair Method for Safe Human-Robot Shared Autonomy

Binghan He, Mahsa Ghasemi, Ufuk Topcu et al.

Shared autonomy provides a framework where a human and an automated system, such as a robot, jointly control the system's behavior, enabling an effective solution for various applications, including human-robot interaction. However, a challenging problem in shared autonomy is safety because the human input may be unknown and unpredictable, which affects the robot's safety constraints. If the human input is a force applied through physical contact with the robot, it also alters the robot's behavior to maintain safety. We address the safety issue of shared autonomy in real-time applications by proposing a two-layer control framework. In the first layer, we use the history of human input measurements to infer what the human wants the robot to do and define the robot's safety constraints according to that inference. In the second layer, we formulate a rapidly-exploring random tree of barrier pairs, with each barrier pair composed of a barrier function and a controller. Using the controllers in these barrier pairs, the robot is able to maintain its safe operation under the intervention from the human input. This proposed control framework allows the robot to assist the human while preventing them from encountering safety issues. We demonstrate the proposed control framework on a simulation of a two-linkage manipulator robot.

RO · Jul 27, 2021
Information-Theoretic Based Target Search with Multiple Agents

Minkyu Kim, Ryan Gupta, Luis Sentis

This paper proposes an online path planning and motion generation algorithm for heterogeneous robot teams performing target search in a real-world environment. Path selection for each robot is optimized using an information-theoretic formulation and is computed sequentially for each agent. First, we generate candidate trajectories sampled from both global waypoints derived from vertical cell decomposition and local frontier points. From this set, we choose the path with maximum information gain. We demonstrate that the hierarchical sequential decision-making structure provided by the algorithm is scalable to multiple agents in a simulation setup. We also validate our framework in a real-world apartment setting using a two-robot team comprising the Unitree A1 quadruped and the Toyota HSR mobile manipulator searching for a person. The agents leverage an efficient leader-follower communication structure where only critical information is shared.
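
The selection rule described above (pick the candidate path with maximum information gain, one agent at a time) can be sketched as follows. This is a simplified stand-in, not the paper's formulation: it assumes a single target, a discrete belief over cells, a perfect sensor, and candidate paths given as tuples of cell indices. All names and numbers are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability cells."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def info_gain(belief, cells):
    """Expected entropy reduction from a perfect look at `cells`."""
    mask = np.zeros(belief.size, dtype=bool)
    mask[list(cells)] = True
    miss = belief[~mask].sum()  # probability that the target is not seen
    posterior = entropy(belief[~mask] / miss) if miss > 0 else 0.0
    return entropy(belief) - miss * posterior

def assign_paths(belief, candidates_per_agent):
    """Sequential choice: each agent accounts for cells already claimed."""
    covered, chosen = set(), []
    for candidates in candidates_per_agent:
        best = max(candidates, key=lambda path: info_gain(belief, covered | set(path)))
        chosen.append(best)
        covered |= set(best)
    return chosen

belief = np.array([0.4, 0.3, 0.2, 0.1])  # hypothetical prior over four cells
chosen = assign_paths(belief, [[(0,), (3,)], [(0,), (1,)]])
```

Because the second agent scores its candidates jointly with the cells the first agent already covers, revisiting the same cell earns no extra gain, which pushes the agents toward different regions.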

LG · Nov 20, 2020
Nested Mixture of Experts: Cooperative and Competitive Learning of Hybrid Dynamical System

Junhyeok Ahn, Luis Sentis

Model-based reinforcement learning (MBRL) algorithms can attain significant sample efficiency but require an appropriate network structure to represent system dynamics. Current approaches include white-box modeling using analytic parameterizations and black-box modeling using deep neural networks. However, both can suffer from a bias-variance trade-off in the learning process, and neither provides a structured method for injecting domain knowledge into the network. As an alternative, gray-box modeling leverages prior knowledge in neural network training but only for simple systems. In this paper, we devise a nested mixture of experts (NMOE) for representing and learning hybrid dynamical systems. An NMOE combines both white-box and black-box models while optimizing bias-variance trade-off. Moreover, an NMOE provides a structured method for incorporating various types of prior knowledge by training the associative experts cooperatively or competitively. The prior knowledge includes information on robots' physical contacts with the environments as well as their kinematic and dynamic properties. In this paper, we demonstrate how to incorporate prior knowledge into our NMOE in various continuous control domains, including hybrid dynamical systems. We also show the effectiveness of our method in terms of data-efficiency, generalization to unseen data, and bias-variance trade-off. Finally, we evaluate our NMOE using an MBRL setup, where the model is integrated with a model-based controller and trained online.

RO · Oct 15, 2020
Task-Adaptive Robot Learning from Demonstration with Gaussian Process Models under Replication

Miguel Arduengo, Adrià Colomé, Júlia Borràs et al.

Learning from Demonstration (LfD) is a paradigm that allows robots to learn complex manipulation tasks that cannot be easily scripted, but can be demonstrated by a human teacher. One of the challenges of LfD is to enable robots to acquire skills that can be adapted to different scenarios. In this paper, we propose to achieve this by exploiting the variations in the demonstrations to retrieve an adaptive and robust policy, using Gaussian Process (GP) models. Adaptability is enhanced by incorporating task parameters into the model, which encode different specifications within the same task. With our formulation, these parameters can be either real, integer, or categorical. Furthermore, we propose a GP design that exploits the structure of replications, i.e., repeated demonstrations with identical conditions within data. Our method significantly reduces the computational cost of model fitting in complex tasks, where replications are essential to obtain a robust model. We illustrate our approach through several experiments on a handwritten letter demonstration dataset.

RO · Sep 25, 2020
A Complex Stiffness Human Impedance Model with Customizable Exoskeleton Control

Binghan He, Huang Huang, Gray C. Thomas et al.

The natural impedance, or dynamic relationship between force and motion, of a human operator can determine the stability of exoskeletons that use interaction-torque feedback to amplify human strength. While human impedance is typically modelled as a linear system, our experiments on a single-joint exoskeleton testbed involving 10 human subjects show evidence of nonlinear behavior: a low-frequency asymptotic phase for the dynamic stiffness of the human that is different than the expected zero, and an unexpectedly consistent damping ratio as the stiffness and inertia vary. To explain these observations, this paper considers a new frequency-domain model of the human joint dynamics featuring complex value stiffness comprising a real stiffness term and a hysteretic damping term. Using a statistical F-test we show that the hysteretic damping term is not only significant but is even more significant than the linear damping term. Further analysis reveals a linear trend linking hysteretic damping and the real part of the stiffness, which allows us to simplify the complex stiffness model down to a 1-parameter system. Then, we introduce and demonstrate a customizable fractional-order controller that exploits this hysteretic damping behavior to improve strength amplification bandwidth while maintaining stability, and explore a tuning approach which ensures that this stability property is robust to muscle co-contraction for each individual.
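
The two observations in this abstract follow directly from a complex-stiffness model, as the short derivation below shows. The symbols are assumptions introduced here, not the paper's notation: M is inertia, B is linear (viscous) damping, K is the real stiffness, and C is the hysteretic damping term, with K > 0.

```latex
% Notation (M, B, K, C) is assumed for this sketch.
\begin{align}
S(j\omega) &= -M\omega^{2} + jB\omega + K + jC, \\
\lim_{\omega \to 0} \angle S(j\omega) &= \arctan\!\left(\frac{C}{K}\right) > 0
  \quad \text{for } C > 0, \\
\zeta_{h} &= \frac{C}{2K}
  \quad \text{(with } B = 0 \text{, matching } B_{\mathrm{eq}}\,\omega_{n} = C
  \text{ at } \omega_{n} = \sqrt{K/M}\text{)}.
\end{align}
```

The second line gives the nonzero low-frequency phase, which a purely viscous model (C = 0) cannot produce. The third line shows that the equivalent damping ratio does not depend on inertia, and that it stays constant as stiffness varies whenever C grows in proportion to K.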

RO · Sep 13, 2020
MPC-Based Hierarchical Task Space Control of Underactuated and Constrained Robots for Execution of Multiple Tasks

Jaemin Lee, Seung Hyeon Bang, Efstathios Bakolas et al.

This paper proposes an MPC-based controller to efficiently execute multiple hierarchical tasks for underactuated and constrained robotic systems. Existing task-space controllers or whole-body controllers solve instantaneous optimization problems given task trajectories and the robot plant dynamics. However, the task-space control method we propose here relies on the prediction of future state trajectories and the corresponding costs-to-go terms over a finite time-horizon for computing control commands. We employ acceleration energy error as the performance index for the optimization problem and extend it over the finite-time horizon of our MPC. Our approach employs quadratically constrained quadratic programming, which includes quadratic constraints to handle multiple hierarchical tasks, and is computationally more efficient than nonlinear MPC-based approaches that rely on nonlinear programming. We validate our approach using numerical simulations of a new type of robot manipulator system, which contains underactuated and constrained mechanical structures.

RO · Sep 5, 2020
BP-RRT: Barrier Pair Synthesis for Temporal Logic Motion Planning

Binghan He, Jaemin Lee, Ufuk Topcu et al.

For a nonlinear system (e.g. a robot) with its continuous state space trajectories constrained by a linear temporal logic specification, the synthesis of a low-level controller for mission execution often results in a non-convex optimization problem. We devise a new algorithm to solve this type of non-convex problem by formulating a rapidly-exploring random tree of barrier pairs, with each barrier pair composed of a quadratic barrier function and a full state feedback controller. The proposed method employs a rapidly-exploring random tree to deal with the non-convex constraints and uses barrier pairs to fulfill the local convex constraints. As such, the method solves control problems fulfilling the required transitions of an automaton in order to satisfy given linear temporal logic constraints. At the same time it synthesizes locally optimal controllers in order to transition between the regions corresponding to the alphabet of the automaton. We demonstrate this new algorithm on a simulation of a two-linkage manipulator robot.

RO · Feb 23, 2020
Gaussian-Process-based Robot Learning from Demonstration

Miguel Arduengo, Adrià Colomé, Joan Lobo-Prat et al.

Endowed with higher levels of autonomy, robots are required to perform increasingly complex manipulation tasks. Learning from demonstration is emerging as a promising paradigm for transferring skills to robots. It allows robots to implicitly learn task constraints by observing the motion executed by a human teacher, which can enable adaptive behavior. We present a novel Gaussian-Process-based learning from demonstration approach. This probabilistic representation makes it possible to generalize over multiple demonstrations and to encode variability along the different phases of the task. In this paper, we address how Gaussian Processes can be used to effectively learn a policy from trajectories in task space. We also present a method to efficiently adapt the policy to fulfill new requirements, and to modulate the robot behavior as a function of task variability. This approach is illustrated through a real-world application using the TIAGo robot.

RO · Dec 14, 2019
Active Object Tracking using Context Estimation: Handling Occlusions and Detecting Missing Targets

Minkyu Kim, Luis Sentis

When performing visual servoing or object tracking tasks, active sensor planning is essential to keep targets in sight or to relocate them when missing. In particular, when dealing with a known target missing from the sensor's field of view, we propose using prior knowledge related to contextual information to estimate its possible location. To this end, this study proposes a Dynamic Bayesian Network that uses contextual information to effectively search for targets. Monte Carlo particle filtering is employed to approximate the posterior probability of the target's state, from which uncertainty is defined. We define the robot's utility function via information-theoretic formalism as seeking the optimal action which reduces uncertainty of a task, prompting robot agents to investigate the location where the target most likely might exist. Using a context state model, we design the agent's high-level decision framework using a Partially-Observable Markov Decision Process. Based on the estimated belief state of the context via sequential observations, the robot's navigation actions are determined to conduct exploratory and detection tasks. By using this multi-modal context model, our agent can effectively handle basic dynamic events, such as obstruction of targets or their absence from the field of view. We implement and demonstrate these capabilities on a mobile robot in real-time.

RO · Oct 2, 2019
Deploying the NASA Valkyrie Humanoid for IED Response: An Initial Approach and Evaluation Summary

Steven Jens Jorgensen, Michael W. Lanighan, Sylvain S. Bertrand et al.

As part of a feasibility study, this paper shows the NASA Valkyrie humanoid robot performing an end-to-end improvised explosive device (IED) response task. To demonstrate and evaluate robot capabilities, sub-tasks highlight different locomotion, manipulation, and perception requirements: traversing uneven terrain, passing through a narrow passageway, opening a car door, retrieving a suspected IED, and securing the IED in a total containment vessel (TCV). For each sub-task, a description of the technical approach and the hidden challenges that were overcome during development are presented. The discussion of results, which explicitly includes existing limitations, is aimed at motivating continued research and development to enable practical deployment of humanoid robots for IED response. For instance, the data shows that operator pauses contribute to 50% of the total completion time, which implies that further work is needed on user interfaces for increasing task completion efficiency.

RO · Sep 19, 2019
Finding Locomanipulation Plans Quickly in the Locomotion Constrained Manifold

Steven Jens Jorgensen, Mihir Vedantam, Ryan Gupta et al.

We present a method that finds locomanipulation plans that perform simultaneous locomotion and manipulation of objects for a desired end-effector trajectory. Key to our approach is considering a generic locomotion constraint manifold that defines the locomotion scheme of the robot and then using this constraint manifold to search for admissible manipulation trajectories. The problem is formulated as a weighted-A* graph search whose planner output is a sequence of contact transitions and a path progression trajectory to construct the whole-body kinodynamic locomanipulation plan. We also provide a method for computing, visualizing and learning the locomanipulability region, which is used to efficiently evaluate the edge transition feasibility during the graph search. Experiments are performed on the NASA Valkyrie robot platform that utilizes a dynamic locomotion approach, called the divergent-component-of-motion (DCM), on two example locomanipulation scenarios.
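
For readers unfamiliar with the search named in this abstract, the sketch below shows a generic weighted A* on a toy grid. It is not the paper's planner: grid cells stand in for planner states, and the blocked cells stand in for edges that a feasibility check would reject. The `neighbors` callback, the grid, and the weight are hypothetical.

```python
import heapq

def weighted_a_star(start, goal, neighbors, heuristic, weight=1.5):
    """Return a path found by best-first search on f = g + weight * h, or None."""
    frontier = [(weight * heuristic(start, goal), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > best_g[node]:
            continue  # stale queue entry; a cheaper route was found later
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt, step_cost in neighbors(node):
            new_g = g + step_cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = node
                f = new_g + weight * heuristic(nxt, goal)
                heapq.heappush(frontier, (f, new_g, nxt))
    return None

# Toy 5x5 grid; BLOCKED cells play the role of infeasible transitions.
BLOCKED = {(1, 0), (1, 1), (1, 2), (1, 3)}

def grid_neighbors(cell):
    """Yield (neighbor, step cost) for 4-connected moves that stay feasible."""
    x, y = cell
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in BLOCKED:
            yield (nx, ny), 1

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

path = weighted_a_star((0, 0), (4, 0), grid_neighbors, manhattan)
```

Inflating an admissible heuristic by a weight greater than 1 trades optimality for speed: the search expands fewer nodes, and the path cost is bounded by the weight times the optimal cost.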

ROSep 14, 2019
Solving Service Robot Tasks: UT Austin Villa@Home 2019 Team Report

Rishi Shah, Yuqian Jiang, Haresh Karnan et al.

RoboCup@Home is an international robotics competition based on domestic tasks requiring autonomous capabilities pertaining to a large variety of AI technologies. Research challenges are motivated by these tasks both at the level of individual technologies and the integration of subsystems into a fully functional, robustly autonomous system. We describe the progress made by the UT Austin Villa 2019 RoboCup@Home team which represents a significant step forward in AI-based HRI due to the breadth of tasks accomplished within a unified system. Presented are the competition tasks, component technologies they rely on, our initial approaches both to the components and their integration, and directions for future research.

ROJun 10, 2019
Data-Efficient and Safe Learning for Humanoid Locomotion Aided by a Dynamic Balancing Model

Junhyeok Ahn, Jaemin Lee, Luis Sentis

In this letter, we formulate a novel Markov Decision Process (MDP) for safe and data-efficient learning for humanoid locomotion aided by a dynamic balancing model. In our previous studies of biped locomotion, we relied on a low-dimensional robot model, commonly used in high-level Walking Pattern Generators (WPGs). However, a low-level feedback controller cannot precisely track desired footstep locations due to the discrepancies between the full order model and the simplified model. In this study, we propose mitigating this problem by complementing a WPG with reinforcement learning. More specifically, we propose a structured footstep control method consisting of a WPG, a neural network, and a safety controller. The WPG provides an analytical method that promotes efficient learning while the neural network maximizes long-term rewards, and the safety controller encourages safe exploration based on step capturability and the use of control-barrier functions. Our contributions include the following: (1) a structured learning control method for locomotion, (2) a data-efficient and safe learning process to improve walking using a physics-based model, and (3) the scalability of the procedure to various types of humanoid robots and walking.
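
To make the control-barrier-function idea concrete, here is a minimal sketch on a deliberately trivial system. It assumes a 1-D integrator `x_dot = u` with a single upper bound `x_max`, which is far simpler than the footstep safety controller the abstract describes; the function names and the gain `alpha` are hypothetical. The filter passes the nominal command (which could come from a learned policy) unchanged whenever it is safe and clips it otherwise.

```python
def cbf_safe_input(x, u_nominal, x_max, alpha=5.0):
    """Filter a nominal velocity command for the 1-D integrator x_dot = u.

    Barrier function: h(x) = x_max - x, safe when h(x) >= 0.
    CBF condition:    h_dot + alpha * h >= 0,
                      i.e. -u + alpha * (x_max - x) >= 0,
    so the largest admissible input is alpha * (x_max - x).
    """
    return min(u_nominal, alpha * (x_max - x))

def simulate(x0, u_nominal, x_max, dt=0.01, steps=500):
    """Forward-Euler rollout of the filtered system (assumes alpha * dt < 1)."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * cbf_safe_input(x, u_nominal, x_max)
        trajectory.append(x)
    return trajectory
```

Far from the bound the nominal command is untouched, so exploration is unrestricted there; near the bound the admissible velocity shrinks to zero, so the state approaches `x_max` without crossing it.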

ROJun 10, 2019
Control of A High Performance Bipedal Robot using Viscoelastic Liquid Cooled Actuators

Junhyeok Ahn, Donghyun Kim, SeungHyeon Bang et al.

This paper describes the control and evaluation of a new human-scaled biped robot with liquid cooled viscoelastic actuators (VLCA). Based on the lessons learned from previous work from our team on VLCA [1], we present a new system design embodying a Reaction Force Sensing Series Elastic Actuator (RFSEA) and a Force Sensing Series Elastic Actuator (FSEA). These designs are aimed at reducing the size and weight of the robot's actuation system while inheriting the advantages of our designs such as energy efficiency, torque density, impact resistance and position/force controllability. The system design takes into consideration human-inspired kinematics and range-of-motion (ROM), while relying on foot placement to balance. In terms of actuator control, we perform a stability analysis on a Disturbance Observer (DOB) designed for force control. We then evaluate various position control algorithms both in the time and frequency domains for our VLCA actuators. Having the low level baseline established, we first perform a controller evaluation on the legs using Operational Space Control (OSC) [2]. Finally, we move on to evaluating the full bipedal robot by accomplishing unsupported dynamic walking by means of the algorithms to appear in [3].

ROMar 26, 2019
Efficient Trajectory Generation for Robotic Systems Constrained by Contact Forces

Jaemin Lee, Efstathios Bakolas, Luis Sentis

In this work, we propose a trajectory generation method for robotic systems with contact force constraints based on optimal control and reachability analysis. Normally, the dynamics and constraints of the contact-constrained robot are nonlinear and coupled to each other. Instead of linearizing the model and constraints, we directly solve the optimal control problem to obtain the feasible state trajectory and the control input of the system. A tractable optimal control problem is formulated and addressed by two complementary approaches: sampling-based dynamic programming and rigorous reachability analysis. The sampling-based method and a Partially Observable Markov Decision Process (POMDP) are used to break down the end-to-end trajectory generation problem via sample-wise optimization in terms of given conditions. The result is a sequence of pairs of subregions to be passed through to reach the final goal. The reachability analysis ensures that we will find at least one trajectory starting from a given initial state and going through a sequence of subregions. The distinctive contribution of our method is that, by reducing the computational complexity of the algorithm, it can handle the intricate contact constraints coupled with the system's dynamics. We validate our method using extensive numerical simulations with a legged robot.

ROMar 22, 2019
Compliance Shaping for Control of Strength Amplification Exoskeletons with Elastic Cuffs

Gray Cortright Thomas, Jeremiah M. Coholich, Luis Sentis

Exoskeletons which amplify the strength of their operators can enable heavy-duty manipulation of unknown objects. However, this type of behavior is difficult to accomplish; it requires the exoskeleton to sense and amplify the operator's interaction forces while remaining stable. But the goals of amplification and robust stability when connected to the operator fundamentally conflict. As a solution, we introduce a design with a spring in series with the force sensitive cuff. This allows us to design an exoskeleton compliance behavior which is nominally passive, even with high amplification ratios. In practice, time delay and discrete time filters prevent our strategy from actually achieving passivity, but the designed compliance still makes the exoskeleton more robust to spring-like human behaviors. Our exoskeleton is actuated by a series elastic actuator (SEA), which introduces another spring into the system. We show that shaping the cuff compliance for the exoskeleton can be made into approximately the same problem as shaping the spring compliance of an SEA. We therefore introduce a feedback controller and gain tuning method which takes advantage of an existing compliance shaping technique for SEAs. We call our strategy the "double compliance shaping" method. With large amplification ratios, this controller tends to amplify nonlinear transmission friction effects, so we additionally propose a "transmission disturbance observer" to mitigate this drawback. Our methods are validated on a single-degree-of-freedom elbow exoskeleton.

ROMar 4, 2019
Toward Achieving Formal Guarantees for Human-Aware Controllers in Human-Robot Interactions

Rachel Schlossman, Minkyu Kim, Ufuk Topcu et al.

With the primary objective of human-robot interaction being to support humans' goals, there exists a need to formally synthesize robot controllers that can provide the desired service. Synthesis techniques have the benefit of providing formal guarantees for specification satisfaction. There is potential to apply these techniques for devising robot controllers whose specifications are coupled with human needs. This paper explores the use of formal methods to construct human-aware robot controllers to support the productivity requirements of humans. We tackle these types of scenarios via human workload-informed models and reactive synthesis. This strategy allows us to synthesize controllers that fulfill formal specifications that are expressed as linear temporal logic formulas. We present a case study in which we reason about a work delivery and pickup task such that the robot increases worker productivity, but not stress induced by high work backlog. We demonstrate our controller using the Toyota HSR, a mobile manipulator robot. The results demonstrate the realization of a robust robot controller that is guaranteed to properly reason and react in collaborative tasks with human partners.

ROMar 2, 2019
Complex Stiffness Model of Physical Human-Robot Interaction: Implications for Control of Performance Augmentation Exoskeletons

Binghan He, Huang Huang, Gray C. Thomas et al.

Human joint dynamic stiffness plays an important role in the stability of performance augmentation exoskeletons. In this paper, we consider a new frequency domain model of the human joint dynamics which features a complex value stiffness. This complex stiffness consists of a real stiffness and a hysteretic damping. We use it to explain the dynamic behaviors of the human connected to the exoskeleton, in particular the observed non-zero low frequency phase shift and the near-constant damping ratio of the resonance as stiffness and inertia vary. We validate this concept by experimenting with an elbow-joint exoskeleton testbed on a subject while modifying joint stiffness behavior, exoskeleton inertia, and strength augmentation gains. We compare three different models of elbow-joint dynamic stiffness: a model with real stiffness, viscous damping and inertia, a model with complex stiffness and inertia, and a model combining the previous two models. Our results show that the hysteretic damping term improves modeling accuracy, using a statistical F-test. Moreover, this improvement is statistically more significant than using a classical viscous damping term. In addition, we experimentally observe a linear relationship between the hysteretic damping and the real part of the stiffness which allows us to simplify the complex stiffness model as a 1-parameter system. Ultimately, we design a fractional order controller to demonstrate how human hysteretic damping behavior can be exploited to improve strength amplification performance while maintaining stability.
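
The two effects this abstract attributes to hysteretic damping can be checked numerically with textbook forms of the two models. The sketch below assumes the viscous model is `K + j*omega*B - M*omega^2` and the complex-stiffness model is `(K + j*C) - M*omega^2`; the abstract does not give the exact parameterization, so these forms, the function names, and all numeric values in the usage are assumptions for illustration.

```python
import cmath
import math

def viscous_stiffness(omega, k, b, m):
    """Dynamic stiffness with real stiffness k, viscous damping b, inertia m."""
    return complex(k - m * omega**2, b * omega)

def hysteretic_stiffness(omega, k, c, m):
    """Dynamic stiffness with complex stiffness k + j*c and inertia m."""
    return complex(k - m * omega**2, c)

def phase_deg(z):
    """Phase of a complex dynamic stiffness in degrees."""
    return math.degrees(cmath.phase(z))

def viscous_damping_ratio(k, b, m):
    """zeta = b / (2 * sqrt(k * m)): changes whenever k or m changes."""
    return b / (2.0 * math.sqrt(k * m))

def hysteretic_damping_ratio(k, c):
    """Equivalent damping ratio at resonance, zeta = c / (2 * k).
    It does not depend on inertia, and stays fixed if c scales with k."""
    return c / (2.0 * k)
```

Evaluating `phase_deg` at a low frequency such as `omega = 0.01` gives a phase near zero for the viscous model but roughly `atan(c / k)` for the hysteretic one, which matches the non-zero low-frequency phase shift mentioned above. If `c` is proportional to `k`, as the reported linear relationship suggests, `hysteretic_damping_ratio` returns the same value for every stiffness.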

ROFeb 25, 2019
Robust and Adaptive Door Operation with a Mobile Robot

Miguel Arduengo, Carme Torras, Luis Sentis

The ability to deal with articulated objects is very important for robots assisting humans. In this work, a framework to robustly and adaptively operate common doors, using an autonomous mobile manipulator, is proposed. To push forward the state-of-the-art in robustness and speed performance, we devise a novel algorithm that fuses a convolutional neural network with efficient point cloud processing. This advancement enables real-time grasping pose estimation for multiple handles from RGB-D images, providing a speed-up for assistive human-centered applications. In addition, we propose a versatile Bayesian framework that endows the robot with the ability to infer the door kinematic model from observations of its motion and learn from previous experiences or human demonstrations. Combining these algorithms with a Task Space Region motion planner, we achieve an efficient door operation regardless of the kinematic model. We validate our framework with real-world experiments using the Toyota Human Support Robot.

ROFeb 1, 2019
Thermal Recovery of Multi-Limbed Robots with Electric Actuators

Steven Jens Jorgensen, James Holley, Frank Mathis et al.

The problem of finding thermally minimizing configurations of a humanoid robot to recover its actuators from unsafe thermal states is addressed. A first-order, data-driven, effort-based, thermal model of the robot's actuators is devised, which is used to predict future thermal states. Given this predictive capability, a map between configurations and future temperatures is formulated to find what configurations, subject to valid contact constraints, can be taken now to minimize future thermal states. Effectively, this approach is a realization of a contact-constrained thermal inverse-kinematics (IK) process. Experimental validation of the proposed approach is performed on the NASA Valkyrie robot hardware.
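
A first-order thermal model of the kind mentioned here can be sketched in a few lines. The abstract does not state the model's form, so the sketch assumes `T_dot = -(T - ambient) / tau + gain * effort**2` with a constant effort over the prediction horizon; the time constant, gain, ambient temperature, and the candidate configurations in the usage are all hypothetical. The second function mimics, in a very reduced way, the idea of choosing the configuration with the lowest predicted future temperature.

```python
import math

def predict_temperature(temp, effort, horizon, tau=60.0, gain=0.02, ambient=25.0):
    """Closed-form solution of T_dot = -(T - ambient)/tau + gain * effort**2
    for a constant effort held over `horizon` seconds."""
    steady_state = ambient + tau * gain * effort**2
    return steady_state + (temp - steady_state) * math.exp(-horizon / tau)

def coolest_configuration(temps, candidates, horizon):
    """Return the candidate whose hottest predicted actuator is coolest.

    `temps` holds the current actuator temperatures; `candidates` maps a
    configuration name to the per-actuator efforts needed to hold it.
    """
    def worst_case(efforts):
        return max(predict_temperature(t, e, horizon)
                   for t, e in zip(temps, efforts))
    return min(candidates, key=lambda name: worst_case(candidates[name]))
```

For example, with `temps = [80.0, 40.0]` and candidates `{"crouch": [10.0, 2.0], "stand": [2.0, 8.0]}`, a 120-second horizon favors `"stand"`, because it unloads the actuator that is already hot. A real implementation would search over contact-feasible configurations rather than a fixed dictionary.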

RODec 4, 2018
The Robot Economy: Here It Comes

Miguel Arduengo, Luis Sentis

Automation is not a new phenomenon, and questions about its effects have long followed its advances. More than a half-century ago, US President Lyndon B. Johnson established a national commission to examine the impact of technology on the economy, declaring that automation "can be the ally of our prosperity if we will just look ahead". In this paper, our premise is that we are at a technological inflection point in which robots are developing the capacity to greatly increase their cognitive and physical capabilities, and thus raising questions on labor dynamics. With increasing levels of autonomy and human-robot interaction, intelligent robots could soon accomplish new human-like capabilities such as engaging in social activities. Therefore, an increase in automation and autonomy brings the question of robots directly participating in some economic activities as autonomous agents. In this paper, a technological framework describing a robot economy is outlined and the challenges it might represent in the current socio-economic scenario are pondered.

RONov 27, 2018
Distributed Impedance Control of Latency-Prone Robotic Systems with Series Elastic Actuation

Ye Zhao, Luis Sentis

Robotic systems are increasingly relying on distributed feedback controllers to tackle complex and latency-prone sensing and decision problems. These demands come at the cost of a growing computational burden and, as a result, larger controller latencies. To maximize robustness to mechanical disturbances and achieve high control performance, we emphasize the necessity for executing damping feedback in close proximity to the control plant while allocating stiffness feedback in a latency-prone centralized control process. Additionally, series elastic actuators (SEAs) have become prevalent in torque-controlled robots in recent years to achieve compliant interactions with environments and humans. However, designing optimal impedance controllers and characterizing impedance performance for SEAs with time delays and filtering are still under-explored problems. The presented study addresses the optimal controller design problem by devising a critically-damped gain design method for a class of SEA cascaded control architectures, which is composed of outer-impedance and inner-torque feedback loops. Via the proposed controller design criterion, we adopt frequency-domain methods to thoroughly analyze the effects of time delays, filtering and load inertia on SEA impedance performance. These results are further validated through the analysis, simulation, and experimental testing on high-performance actuators and on an omnidirectional mobile base.
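
For readers unfamiliar with critically damped gain selection, the textbook second-order case gives the flavor. This sketch is not the cascaded SEA design method of the paper: it assumes an ideal mass `M` under a stiffness gain `K` and damping gain `B`, with no time delay, filtering, or inner torque loop, and uses the standard rule `B = 2 * zeta * sqrt(K * M)`. The function names and numeric values are hypothetical.

```python
import math

def critically_damped_gain(stiffness, inertia, zeta=1.0):
    """Damping gain B = 2 * zeta * sqrt(K * M) for M*x'' + B*x' + K*x = K*x_ref."""
    return 2.0 * zeta * math.sqrt(stiffness * inertia)

def step_response_peak(stiffness, damping, inertia, dt=1e-4, t_end=2.0):
    """Simulate a unit step in the position reference with semi-implicit
    Euler integration and return the largest position reached."""
    x, v, peak = 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        accel = (stiffness * (1.0 - x) - damping * v) / inertia
        v += dt * accel
        x += dt * v
        peak = max(peak, x)
    return peak
```

With `K = 100` and `M = 1` the rule gives `B = 20`, and the simulated step response settles at the reference without overshoot, whereas using a fifth of that damping overshoots by roughly half the step size. Delays and filters shift the effective damping, which is why a dedicated design method is needed for the SEA case.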

RONov 11, 2018
Reactive Task and Motion Planning for Robust Whole-Body Dynamic Locomotion in Constrained Environments

Ye Zhao, Yinan Li, Luis Sentis et al.

Contact-based decision and planning methods are becoming increasingly important to endow higher levels of autonomy for legged robots. Formal synthesis methods derived from symbolic systems have great potential for reasoning about high-level locomotion decisions and achieving complex maneuvering behaviors with correctness guarantees. This study takes a first step toward formally devising an architecture composed of task planning and control of whole-body dynamic locomotion behaviors in constrained and dynamically changing environments. At the high level, we formulate a two-player temporal logic game between the multi-limb locomotion planner and its dynamic environment to synthesize a winning strategy that delivers symbolic locomotion actions. These locomotion actions satisfy the desired high-level task specifications expressed in a fragment of temporal logic. Those actions are sent to a robust finite transition system that synthesizes a locomotion controller that fulfills state reachability constraints. This controller is further executed via a low-level motion planner that generates feasible locomotion trajectories. We construct a set of dynamic locomotion models for legged robots to serve as a template library for handling diverse environmental events. We devise a replanning strategy that takes into consideration sudden environmental changes or large state disturbances to increase the robustness of the resulting locomotion behaviors. We formally prove the correctness of the layered locomotion framework guaranteeing a robust implementation by the motion planning layer. Simulations of reactive locomotion behaviors in diverse environments indicate that our framework has the potential to serve as a theoretical foundation for intelligent locomotion behaviors.

RONov 8, 2018
LAAIR: A Layered Architecture for Autonomous Interactive Robots

Yuqian Jiang, Nick Walker, Minkyu Kim et al.

When developing general purpose robots, the overarching software architecture can greatly affect the ease of accomplishing various tasks. Initial efforts to create unified robot systems in the 1990s led to hybrid architectures, emphasizing a hierarchy in which deliberative plans direct the use of reactive skills. However, since that time there has been significant progress in the low-level skills available to robots, including manipulation and perception, making it newly feasible to accomplish many more tasks in real-world domains. There is thus renewed optimism that robots will be able to perform a wide array of tasks while maintaining responsiveness to human operators. However, the top layer in traditional hybrid architectures, designed to achieve long-term goals, can make it difficult to react quickly to human interactions during goal-driven execution. To mitigate this difficulty, we propose a novel architecture that supports such transitions by adding a top-level reactive module which has flexible access to both reactive skills and a deliberative control module. To validate this architecture, we present a case study of its application on a domestic service robot platform.

ROSep 27, 2018
Trajectory Generation for Robotic Systems with Contact Force Constraints

Jaemin Lee, Efstathios Bakolas, Luis Sentis

This paper presents a trajectory generation method for contact-constrained robotic systems such as manipulators and legged robots. Contact-constrained systems are affected by the interaction forces between the robot and the environment. In turn, these forces determine and constrain state reachability of the robot parts or end effectors. Our study subdivides the trajectory generation problem and the supporting reachability analysis into tractable subproblems consisting of a sampling problem, a convex optimization problem, and a nonlinear programming problem. Our method leads to a significant reduction of computational cost. The proposed approach is validated using a realistic simulated contact-constrained robotic system.

ROSep 24, 2018
An Architecture for Person-Following using Active Target Search

Minkyu Kim, Miguel Arduengo, Nick Walker et al.

This paper presents a novel architecture for person-following robots using active search. The proposed system can be applied in real-time to general mobile robots for learning features of a human, detecting and tracking, and finally navigating towards that person. To succeed at person-following, perception, planning, and robot behavior need to be integrated properly. Toward this end, an active target searching capability, including prediction and navigation toward vantage locations for finding human targets, is proposed. The proposed capability aims at improving the robustness and efficiency for tracking and following people under dynamic conditions such as crowded environments. A multi-modal sensor information approach, including fusing an RGB-D sensor and a laser scanner, is pursued to robustly track and identify human targets. Bayesian filtering for keeping track of humans and a regression algorithm to predict the trajectory of people are investigated. In order to make the robot autonomous, the proposed framework relies on a behavior-tree structure. Using Toyota Human Support Robot (HSR), real-time experiments demonstrate that the proposed architecture can generate fast, efficient person-following behaviors.

ROSep 24, 2018
Social Navigation Planning Based on People's Awareness of Robots

Minkyu Kim, Jaemin Lee, Steven Jens Jorgensen et al.

When mobile robots maneuver near people, they run the risk of rudely blocking their paths; but not all people behave the same around robots. People that have not noticed the robot are the most difficult to predict. This paper investigates how mobile robots can generate acceptable paths in dynamic environments by predicting human behavior. Here, human behavior may include both physical and mental behavior; we focus on the latter. We introduce a simple safe interaction model: when a human seems unaware of the robot, it should avoid going too close. In this study, people around robots are detected and tracked using sensor fusion and filtering techniques. To handle uncertainties in the dynamic environment, a Partially-Observable Markov Decision Process Model (POMDP) is used to formulate a navigation planning problem in the shared environment. People's awareness of robots is inferred and included as a state and reward model in the POMDP. The proposed planner enables a robot to change its navigation plan based on its perception of each person's robot-awareness. As far as we can tell, this is a new capability. We conduct simulation and experiments using the Toyota Human Support Robot (HSR) to validate our approach. We demonstrate that the proposed framework is capable of running in real-time.
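
Inferring a hidden "awareness" state inside a POMDP rests on a Bayes belief update, which can be sketched for a single person with a binary state. Everything specific below is an assumption made for illustration: the observation ("the person is looking toward the robot"), the transition and observation probabilities, and the linear clearance rule are not taken from the paper.

```python
def update_awareness_belief(belief_aware, observed_looking,
                            p_look_if_aware=0.8, p_look_if_unaware=0.1,
                            p_stay_aware=0.95, p_become_aware=0.2):
    """One step of a discrete Bayes filter over {aware, unaware}.

    belief_aware is the prior probability that the person is aware of the
    robot; the return value is the posterior after one observation.
    """
    # Prediction: propagate the belief through the transition model.
    predicted = (belief_aware * p_stay_aware
                 + (1.0 - belief_aware) * p_become_aware)
    # Correction: weight each hypothesis by the observation likelihood.
    if observed_looking:
        like_aware, like_unaware = p_look_if_aware, p_look_if_unaware
    else:
        like_aware, like_unaware = 1.0 - p_look_if_aware, 1.0 - p_look_if_unaware
    numerator = like_aware * predicted
    return numerator / (numerator + like_unaware * (1.0 - predicted))

def clearance(belief_aware, near=0.5, far=1.5):
    """Minimum passing distance in meters: more room for likely-unaware people."""
    return far - (far - near) * belief_aware
```

Starting from a belief of 0.5, one "looking" observation raises the belief to about 0.92 and one "not looking" observation lowers it to about 0.23, and the clearance shrinks or grows accordingly. A full POMDP planner would plan over this belief rather than apply a fixed rule.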

ROSep 24, 2018
Prioritized Kinematic Control of Joint-Constrained Head-Eye Robots using the Intermediate Value Approach

Steven Jens Jorgensen, Orion Campbell, Travis Llado et al.

Existing gaze controllers for head-eye robots can only handle single fixation points. Here, a generic controller for head-eye robots capable of executing simultaneous and prioritized fixation trajectories in Cartesian space is presented. This enables the specification of multiple operational-space behaviors with priority such that the execution of a low priority head orientation task does not disturb the satisfaction of a higher prioritized eye gaze task. Through our approach, the head-eye robot inherently gains the biomimetic vestibulo-ocular reflex (VOR), which is the ability of gaze stabilization under self-generated movements. The described controller utilizes recursive null space projections to encode joint limit constraints and task priorities. To handle the solution discontinuity that occurs when joint limit tasks are inserted or removed as a constraint, the Intermediate Desired Value (IDV) approach is applied. Experimental validation of the controller's properties is demonstrated with the Dreamer humanoid robot. Our contributions are (1) the formulation of a desired gaze task as an operational space orientation task, (2) the application details of the IDV approach for the prioritized head-eye robot controller that can handle intermediate joint constraints, and (3) a minimum-jerk specification for behavior and trajectory generation in Cartesian space.
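
The null space projection behind task prioritization can be shown with two scalar tasks, which keeps the pseudoinverses trivial. This is a minimal sketch under strong assumptions: each task Jacobian is a single row, there are only two priority levels, and joint limits and the IDV approach are left out entirely. The function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def prioritized_velocities(j1, dx1, j2, dx2, eps=1e-9):
    """Joint velocities for two scalar tasks with strict priority.

    j1, j2 are 1 x n task Jacobians given as lists; dx1, dx2 are the desired
    task velocities. Task 2 acts only in the null space of task 1.
    """
    n1 = dot(j1, j1)
    # Primary task: minimum-norm solution dq = pinv(J1) * dx1.
    dq = [j * dx1 / n1 for j in j1]
    # Project J2 into the null space of J1: J2 * (I - pinv(J1) * J1).
    scale = dot(j2, j1) / n1
    j2_null = [b - scale * a for a, b in zip(j1, j2)]
    n2 = dot(j2_null, j2_null)
    if n2 > eps:
        # Correct only the part of task 2 not already produced by task 1.
        residual = dx2 - dot(j2, dq)
        dq = [q + j * residual / n2 for q, j in zip(dq, j2_null)]
    return dq
```

When the two Jacobians are independent, both task velocities are achieved exactly. When they conflict (for instance `j2` parallel to `j1`), the projected Jacobian vanishes and only the higher-priority task is tracked, which is the behavior that lets a head task run without disturbing a gaze task.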

ROJul 9, 2018
Fast Kinodynamic Bipedal Locomotion Planning with Moving Obstacles

Junhyeok Ahn, Orion Campbell, Donghyun Kim et al.

We present a sampling-based kinodynamic planning framework for a bipedal robot in complex environments. Unlike other footstep planners, which typically plan footstep locations and the biped dynamics in separate steps, we handle both simultaneously. Three advantages of this approach are (1) the ability to differentiate alternate routes while selecting footstep locations based on the temporal duration of the route as determined by the Linear Inverted Pendulum Model dynamics, (2) the ability to perform collision checking through time so that collisions with moving obstacles are prevented without avoiding their entire trajectory, and (3) the ability to specify a minimum forward velocity for the biped. To generate a dynamically consistent description of the walking behavior, we exploit the Phase Space Planner (PSP). To plan a collision free route toward the goal, we adapt planning strategies from non-holonomic wheeled robots to gather a sequence of inputs for the PSP. This allows us to efficiently approximate dynamic and kinematic constraints on bipedal motion, to apply sampling-based planning algorithms, and to use the Dubins path as the steering method to connect two points in the configuration space. The results of the algorithm are sent to a Whole Body Controller to generate full body dynamic walking behavior.
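
The Linear Inverted Pendulum Model referred to above has a closed-form solution, which is what makes timing-based route comparison cheap. The sketch below uses the standard 1-D form `x_ddot = (g / height) * (x - p)` with a fixed foot position `p` and constant center-of-mass height; it is a generic textbook model, not the planner's implementation, and the default height and function names are assumptions.

```python
import math

GRAVITY = 9.81  # m/s^2

def lipm_state(x0, v0, foot, t, height=0.9):
    """Center-of-mass position and velocity after t seconds of
    x_ddot = (g / height) * (x - foot), starting from (x0, v0)."""
    omega = math.sqrt(GRAVITY / height)
    c, s = math.cosh(omega * t), math.sinh(omega * t)
    x = foot + (x0 - foot) * c + (v0 / omega) * s
    v = (x0 - foot) * omega * s + v0 * c
    return x, v

def orbital_energy(x, v, foot, height=0.9):
    """Conserved quantity of the LIPM: E = v^2 / 2 - (g / height) * (x - foot)^2 / 2.
    E > 0 means the center of mass will pass over the foot."""
    return 0.5 * v**2 - 0.5 * (GRAVITY / height) * (x - foot) ** 2
```

For example, starting 0.1 m behind the foot at 0.6 m/s, the orbital energy is positive and the center of mass is ahead of the foot 0.3 s later. Because the state at any time is available in closed form, the duration of a candidate step can be evaluated without numerical integration.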

ROJul 3, 2018
Computationally-Robust and Efficient Prioritized Whole-Body Controller with Contact Constraints

Donghyun Kim, Jaemin Lee, Orion Campbell et al.

In this paper, we devise methods for the multi-objective control of humanoid robots, a.k.a. prioritized whole-body controllers, that achieve efficiency and robustness in the algorithmic computations. We use a form of whole-body controllers that is very general via incorporating centroidal momentum dynamics, operational task priorities, contact reaction forces, and internal force constraints. First, we achieve efficiency by solving a quadratic program that only involves the floating base dynamics and the reaction forces. Second, we achieve computational robustness by relaxing task accelerations such that they comply with friction cone constraints. Finally, we incorporate methods for smooth contact transitions to enhance the control of dynamic locomotion behaviors. The proposed methods are demonstrated both in simulation and in real experiments using a passive-ankle bipedal robot.

ROMar 29, 2018
Decentralized Control Systems Laboratory Using Human Centered Robotic Actuators

Binghan He, Kunye Chen, Rachel Schlossman et al.

University laboratories deliver unique hands-on experimentation for STEM students but often lack state-of-the-art equipment and provide limited access to their equipment. The University of Texas Cloud Laboratory provides remote access to cutting-edge series elastic actuators for student experimentation regarding human-centered robotics, dynamical systems, and controls. Through a browser-based interface, students are provided with various learning materials using the remote hardware-in-the-loop system for effective experiment-based education. This paper discusses the methods used to connect remote hardware to mobile browsers, the adaptation of textbook materials regarding system identification and feedback control, data processing to generate clean and useful results for student interpretation, and initial usage of the end-to-end system for individual and group learning.

ROMar 5, 2018
On Blocking Collisions between People, Objects and other Robots

Kwan Suk Kim, Luis Sentis

Intentional or unintentional contacts are bound to occur increasingly often due to the deployment of autonomous systems in human environments. In this paper, we devise methods to computationally predict imminent collisions between objects, robots and people, and use an upper-body humanoid robot to block them if they are likely to happen. We employ statistical methods for effective collision prediction followed by sensor-based trajectory generation and real-time control to attempt to stop the likely collisions using the most favorable part of the blocking robot. We thoroughly investigate collisions in various types of experimental setups involving objects, robots, and people. Overall, the main contribution of this paper is to devise sensor-based prediction, trajectory generation and control processes for highly articulated robots to prevent collisions against people, and conduct numerous experiments to validate this approach.

ROFeb 27, 2018
Exploiting the Natural Dynamics of Series Elastic Robots by Actuator-Centered Sequential Linear Programming

Rachel Schlossman, Gray C. Thomas, Orion Campbell et al.

Series elastic robots are best able to follow trajectories which obey the limitations of their actuators, since they cannot instantly change their joint forces. In fact, the performance of series elastic actuators can surpass that of ideal force source actuators by storing and releasing energy. In this paper, we formulate the trajectory optimization problem for series elastic robots in a novel way based on sequential linear programming. Our framework is unique in the separation of the actuator dynamics from the rest of the dynamics, and in the use of a tunable pseudo-mass parameter that improves the discretization accuracy of our approach. The actuator dynamics are truly linear, which allows them to be excluded from trust-region mechanics. This causes our algorithm to have similar run times with and without the actuator dynamics. We demonstrate our optimization algorithm by tuning high performance behaviors for a single-leg robot in simulation and on hardware for a single degree-of-freedom actuator testbed. The results show that compliance allows for faster motions and takes a similar amount of computation time.

ROAug 19, 2017
Robust Optimal Planning and Control of Non-Periodic Bipedal Locomotion with A Centroidal Momentum Model

Ye Zhao, Benito R. Fernandez, Luis Sentis

This study presents a theoretical method for planning and controlling agile bipedal locomotion based on robustly tracking a set of non-periodic keyframe states. Based on centroidal momentum dynamics, we formulate a hybrid phase-space planning and control method which includes the following key components: (i) a step transition solver that enables dynamically tracking non-periodic keyframe states over various types of terrains, (ii) a robust hybrid automaton to effectively formulate planning and control algorithms, (iii) a steering direction model to control the robot's heading, (iv) a phase-space metric to measure distance to the planned locomotion manifolds, and (v) a hybrid control method based on the previous distance metric to produce robust dynamic locomotion under external disturbances. Compared to other locomotion methodologies, we have a large focus on non-periodic gait generation and robustness metrics to deal with disturbances. Such focus enables the proposed control method to robustly track non-periodic keyframe states over various challenging terrains and under external disturbances as illustrated through several simulations.

ROAug 7, 2017
Robust Dynamic Locomotion via Reinforcement Learning and Novel Whole Body Controller

Donghyun Kim, Jaemin Lee, Luis Sentis

We propose a robust dynamic walking controller consisting of a dynamic locomotion planner, a reinforcement learning process for robustness, and a novel whole-body locomotion controller (WBLC). Previous approaches specify either the position or the timing of steps; however, the proposed locomotion planner simultaneously computes both of these parameters as locomotion outputs. Our locomotion strategy relies on devising a reinforcement learning (RL) approach for robust walking. The learned policy generates multi-step walking patterns, and the process is quick enough to be suitable for real-time controls. For learning, we devise an RL strategy that uses a phase space planner (PSP) and a linear inverted pendulum model to make the problem tractable and very fast. Then, the learned policy is used to provide goal-based commands to the WBLC, which calculates the torque commands to be executed in full-humanoid robots. The WBLC combines multiple prioritized tasks and calculates the associated reaction forces based on practical inequality constraints. The novel formulation includes efficient calculation of the time derivatives of various Jacobians. This provides high-fidelity dynamic control of fast motions. More specifically, we compute the time derivative of the Jacobian for various tasks and the Jacobian of the centroidal momentum task by utilizing Lie group operators and operational space dynamics respectively. The integration of RL-PSP and the WBLC provides highly robust, versatile, and practical locomotion including steering while walking and handling push disturbances of up to 520 N during an interval of 0.1 sec. Theoretical and numerical results are tested through a 3D physics-based simulation of the humanoid robot Valkyrie.