RO Sep 20, 2024
SoloParkour: Constrained Reinforcement Learning for Visual Locomotion from Privileged Experience
Elliot Chane-Sane, Joseph Amigo, Thomas Flayols et al.
Parkour poses a significant challenge for legged robots, requiring navigation through complex environments with agility and precision based on limited sensory inputs. In this work, we introduce a novel method for training end-to-end visual policies, from depth pixels to robot control commands, to achieve agile and safe quadruped locomotion. We formulate robot parkour as a constrained reinforcement learning (RL) problem designed to maximize the emergence of agile skills within the robot's physical limits while ensuring safety. We first train a policy without vision using privileged information about the robot's surroundings. We then generate experience from this privileged policy to warm-start a sample efficient off-policy RL algorithm from depth images. This allows the robot to adapt behaviors from this privileged experience to visual locomotion while circumventing the high computational costs of RL directly from pixels. We demonstrate the effectiveness of our method on a real Solo-12 robot, showcasing its capability to perform a variety of parkour skills such as walking, climbing, leaping, and crawling.
LG Apr 5, 2022
Model Based Meta Learning of Critics for Policy Gradients
Sarah Bechtle, Ludovic Righetti, Franziska Meier
Being able to seamlessly generalize across different tasks is fundamental for robots to act in our world. However, learning representations that generalize quickly to new scenarios is still an open research problem in reinforcement learning. In this paper we present a framework to meta-learn the critic for gradient-based policy learning. Concretely, we propose a model-based bi-level optimization algorithm that updates the critic's parameters such that the policy that is learned with the updated critic gets closer to solving the meta-training tasks. We illustrate that our algorithm leads to learned critics that resemble the ground truth Q function for a given task. Finally, after meta-training, the learned critic can be used to learn new policies for new, unseen task and environment settings via model-free policy gradient optimization, without requiring a model. We present results that show the generalization capabilities of our learned critic to new tasks and dynamics when used to learn a new policy in a new scenario.
RO Feb 5
Coupled Local and Global World Models for Efficient First Order RL
Joseph Amigo, Rooholla Khorrambakht, Nicolas Mansard et al.
World models offer a promising avenue for more faithfully capturing complex dynamics, including contacts and non-rigidity, as well as complex sensory information, such as visual perception, in situations where standard simulators struggle. However, these models are computationally complex to evaluate, posing a challenge for popular RL approaches that have been successfully used with simulators to solve complex locomotion tasks but still struggle with manipulation. This paper introduces a method that bypasses simulators entirely, training RL policies inside world models learned from robots' interactions with real environments. At its core, our approach enables policy training with large-scale diffusion models via a novel decoupled first-order gradient (FoG) method: a full-scale world model generates accurate forward trajectories, while a lightweight latent-space surrogate approximates its local dynamics for efficient gradient computation. This coupling of a local and global world model ensures high-fidelity unrolling alongside computationally tractable differentiation. We demonstrate the efficacy of our method on the Push-T manipulation task, where it significantly outperforms PPO in sample efficiency. We further evaluate our approach through an ego-centric object manipulation task with a quadruped. Together, these results demonstrate that learning inside data-driven world models is a promising pathway for solving hard-to-model RL tasks in image space without reliance on hand-crafted physics simulators.
RO Oct 16, 2020 Code
Variable Horizon MPC with Swing Foot Dynamics for Bipedal Walking Control
Elham Daneshmand, Majid Khadiv, Felix Grimminger et al.
In this paper, we present a novel two-level variable Horizon Model Predictive Control (VH-MPC) framework for bipedal locomotion. In this framework, the higher level computes the landing location and timing (horizon length) of the swing foot to stabilize the unstable part of the center of mass (CoM) dynamics, using feedback from the CoM states. The lower level takes into account the swing foot dynamics and generates dynamically consistent trajectories for landing at the desired time as close as possible to the desired location. To do that, we use a simplified model of the robot dynamics projected in swing foot space that takes into account joint torque constraints as well as the friction cone constraints of the stance foot. We show the effectiveness of our proposed control framework by implementing robust walking patterns on our torque-controlled and open-source biped robot, Bolt. We report extensive simulations and real robot experiments in the presence of various disturbances and uncertainties.
RO Aug 8, 2020 Code
TriFinger: An Open-Source Robot for Learning Dexterity
Manuel Wüthrich, Felix Widmaier, Felix Grimminger et al.
Dexterous object manipulation remains an open problem in robotics, despite the rapid progress in machine learning during the past decade. We argue that a hindrance is the high cost of experimentation on real systems, in terms of both time and money. We address this problem by proposing an open-source robotic platform which can safely operate without human supervision. The hardware is inexpensive (about $5000) yet highly dynamic, robust, and capable of complex interaction with external objects. The software operates at 1 kHz and performs safety checks to prevent the hardware from breaking. The easy-to-use front-end (in C++ and Python) is suitable for real-time control as well as deep reinforcement learning. In addition, the software framework is largely robot-agnostic and can hence be used independently of the hardware proposed herein. Finally, we illustrate the potential of the proposed platform through a number of experiments, including real-time optimal control, deep reinforcement learning from scratch, throwing, and writing.
RO Sep 30, 2019 Code
An Open Torque-Controlled Modular Robot Architecture for Legged Locomotion Research
Felix Grimminger, Avadesh Meduri, Majid Khadiv et al.
We present a new open-source torque-controlled legged robot system, with a low-cost and low-complexity actuator module at its core. It consists of a high-torque brushless DC motor and a low-gear-ratio transmission suitable for impedance and force control. We also present a novel foot contact sensor suitable for legged locomotion with hard impacts. A 2.2 kg quadruped robot with a large range of motion is assembled from eight identical actuator modules and four lower legs with foot contact sensors. Leveraging standard plastic 3D printing and off-the-shelf parts results in a lightweight and inexpensive robot, allowing for rapid distribution and duplication within the research community. We systematically characterize the achieved impedance at the foot in both static and dynamic scenarios, and measure a maximum dimensionless leg stiffness of 10.8 without active damping, which is comparable to the leg stiffness of a running human. Finally, to demonstrate the capabilities of the quadruped, we present a novel controller which combines feedforward contact forces computed from a kino-dynamic optimizer with impedance control of the center of mass and base orientation. The controller can regulate complex motions while being robust to environmental uncertainty.
RO Sep 11, 2019 Code
Crocoddyl: An Efficient and Versatile Framework for Multi-Contact Optimal Control
Carlos Mastalli, Rohan Budhiraja, Wolfgang Merkt et al.
We introduce Crocoddyl (Contact RObot COntrol by Differential DYnamic Library), an open-source framework tailored for efficient multi-contact optimal control. Crocoddyl efficiently computes the state trajectory and the control policy for a given predefined sequence of contacts. Its efficiency is due to the use of sparse analytical derivatives, exploitation of the problem structure, and data sharing. It employs differential geometry to properly describe the state of any geometrical system, e.g. floating-base systems. Additionally, we propose a novel optimal control algorithm called Feasibility-driven Differential Dynamic Programming (FDDP). Our method does not add extra decision variables, which often increase the computation time per iteration due to factorization. FDDP shows a better globalization strategy than classical Differential Dynamic Programming (DDP) algorithms. Concretely, we propose two modifications to the classical DDP algorithm. First, the backward pass accepts infeasible state-control trajectories. Second, the rollout keeps the gaps open during the early "exploratory" iterations (as expected in multiple-shooting methods with only equality constraints). We showcase the performance of our framework using different tasks. With our method, we can compute highly dynamic maneuvers (e.g. jumping, front-flip) within a few milliseconds.
LG Jun 12, 2019 Code
Meta-Learning via Learned Loss
Sarah Bechtle, Artem Molchanov, Yevgen Chebotar et al.
Typically, loss functions, regularization mechanisms and other important aspects of training parametric models are chosen heuristically from a limited set of options. In this paper, we take the first step towards automating this process, with the view of producing models which train faster and more robustly. Concretely, we present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures. We develop a pipeline for meta-training such loss functions, targeted at maximizing the performance of the model trained under them. The loss landscape produced by our learned losses significantly improves upon the original task-specific losses in both supervised and reinforcement learning tasks. Furthermore, we show that our meta-learning framework is flexible enough to incorporate additional information at meta-train time. This information shapes the learned loss function such that the environment does not need to provide this information during meta-test time. We make our code available at https://sites.google.com/view/mlthree.
RO Apr 26
Cooptimizing Safety and Performance Using Safety Value-Constrained Model Predictive Control
Hao Wang, Nam Nguyen, Armand Jordana et al.
Autonomous systems are increasingly deployed in real-world environments, where they must achieve high performance while maintaining safety under state and input constraints. Although Model Predictive Control (MPC) provides a principled framework for constrained optimal control, guaranteeing safety beyond its finite planning horizon remains a fundamental challenge. In this work, we augment MPC with a safety value function-based terminal constraint that enforces membership in a control-invariant safe set at the end of each planning horizon. This formulation enables real-time synthesis of trajectories that are both high-performing and provably safe. We show that, under an exact safety value function and a feasible initialization, the proposed MPC scheme is recursively feasible, thereby ensuring persistent safety. In contrast to traditional terminal set constructions that rely on local linearizations or conservative approximations, our approach incorporates a reachability-based safety value function for terminal constraints, yielding less conservative and more expressive safety guarantees. We validate the proposed framework through simulation and hardware experiments on a Flexiv Rizon 10s manipulator. Results demonstrate improved constraint satisfaction and robustness compared to standard state-constrained MPC and reactive safety filtering, while maintaining competitive task performance. The full implementation and experiments are available on the project website.
LG May 13, 2025
Cost Function Estimation Using Inverse Reinforcement Learning with Minimal Observations
Sarmad Mehrdad, Avadesh Meduri, Ludovic Righetti
We present an iterative inverse reinforcement learning algorithm to infer optimal cost functions in continuous spaces. Based on a popular maximum entropy criterion, our approach iteratively finds a weight improvement step and proposes a method to find an appropriate step size that ensures learned cost function features remain similar to the demonstrated trajectory features. In contrast to similar approaches, our algorithm can individually tune the effectiveness of each observation for the partition function and does not need a large sample set, enabling faster learning. We generate sample trajectories by solving an optimal control problem instead of random sampling, leading to more informative trajectories. The performance of our method is compared to two state-of-the-art algorithms to demonstrate its benefits in several simulated environments.
RO Mar 8
Toward Global Intent Inference for Human Motion by Inverse Reinforcement Learning
Sarmad Mehrdad, Maxime Sabbah, Vincent Bonnet et al.
This paper investigates whether a single, unified cost function can explain and predict human reaching movements, in contrast with existing approaches that rely on subject- or posture-specific optimization criteria. Using the Minimal Observation Inverse Reinforcement Learning (MO-IRL) algorithm, together with a seven-dimensional set of candidate cost terms, we efficiently estimate time-varying cost weights for a standard planar reaching task. MO-IRL provides orders-of-magnitude faster convergence than bilevel formulations, while using only a fraction of the available data, enabling the practical exploration of time-varying cost structures. Three levels of generality are evaluated: Subject-Dependent Posture-Dependent, Subject-Dependent Posture-Independent, and Subject-Independent Posture-Independent. Across all cases, time-varying weights substantially improve trajectory reconstruction, yielding an average 27% reduction in RMSE compared to the baseline. The inferred costs consistently highlight a dominant role for joint-acceleration regulation, complemented by smaller contributions from torque-change smoothness. Overall, a single subject- and posture-agnostic time-varying cost function is shown to predict human reaching trajectories with high accuracy, supporting the existence of a unified optimality principle governing this class of movements.
RO Aug 29, 2025
First Order Model-Based RL through Decoupled Backpropagation
Joseph Amigo, Rooholla Khorrambakht, Elliot Chane-Sane et al.
There is growing interest in reinforcement learning (RL) methods that leverage the simulator's derivatives to improve learning efficiency. While early gradient-based approaches have demonstrated superior performance compared to derivative-free methods, accessing simulator gradients is often impractical due to their implementation cost or unavailability. Model-based RL (MBRL) can approximate these gradients via learned dynamics models, but the solver efficiency suffers from compounding prediction errors during training rollouts, which can degrade policy performance. We propose an approach that decouples trajectory generation from gradient computation: trajectories are unrolled using a simulator, while gradients are computed via backpropagation through a learned differentiable model of the simulator. This hybrid design enables efficient and consistent first-order policy optimization, even when simulator gradients are unavailable, as well as learning a critic from simulation rollouts, which is more accurate. Our method achieves the sample efficiency and speed of specialized optimizers such as SHAC, while maintaining the generality of standard approaches like PPO and avoiding ill behaviors observed in other first-order MBRL methods. We empirically validate our algorithm on benchmark control tasks and demonstrate its effectiveness on a real Go2 quadruped robot, across both quadrupedal and bipedal locomotion tasks.
RO Feb 25, 2022
On the Use of Torque Measurement in Centroidal State Estimation
Shahram Khorshidi, Ahmad Gazar, Nicholas Rotella et al.
State of the art legged robots are either capable of measuring torque at the output of their drive systems, or have transparent drive systems which enable the computation of joint torques from motor currents. In either case, this sensor modality is seldom used in state estimation. In this paper, we propose to use joint torque measurements to estimate the centroidal states of legged robots. To do so, we project the whole-body dynamics of a legged robot into the nullspace of the contact constraints, allowing expression of the dynamics independent of the contact forces. Using the constrained dynamics and the centroidal momentum matrix, we are able to directly relate joint torques and centroidal states dynamics. Using the resulting model as the process model of an Extended Kalman Filter (EKF), we fuse the torque measurement in the centroidal state estimation problem. Through real-world experiments on a quadruped robot with different gaits, we demonstrate that the estimated centroidal states from our torque-based EKF drastically improve the recovery of these quantities compared to direct computation.
RO Jan 19, 2022
BiConMP: A Nonlinear Model Predictive Control Framework for Whole Body Motion Planning
Avadesh Meduri, Paarth Shah, Julian Viereck et al.
Online planning of whole-body motions for legged robots is challenging due to the inherent nonlinearity in the robot dynamics. In this work, we propose a nonlinear MPC framework, BiConMP, which can generate whole body trajectories online by efficiently exploiting the structure of the robot dynamics. BiConMP is used to generate various cyclic gaits on a real quadruped robot and its performance is evaluated on different terrains, countering unforeseen pushes and transitioning online between different gaits. Further, the ability of BiConMP to generate non-trivial acyclic whole-body dynamic motions on the robot is presented. The same approach is also used to generate various dynamic motions in MPC on a humanoid robot (Talos) and another quadruped robot (AnYmal) in simulation. Finally, an extensive empirical analysis of the effects of planning horizon and frequency on the nonlinear MPC framework is reported and discussed.
RO Jan 11, 2022
ValueNetQP: Learned one-step optimal control for legged locomotion
Julian Viereck, Avadesh Meduri, Ludovic Righetti
Optimal control is a successful approach to generate motions for complex robots, in particular for legged locomotion. However, these techniques are often too slow to run in real time for model predictive control or one needs to drastically simplify the dynamics model. In this work, we present a method to learn to predict the gradient and Hessian of the problem value function, enabling fast resolution of the predictive control problem with a one-step quadratic program. In addition, our method is able to satisfy constraints like friction cones and unilateral constraints, which are important for highly dynamic locomotion tasks. We demonstrate the capability of our method in simulation and on a real quadruped robot performing trotting and bounding motions.
RO Oct 27, 2021
Millimeter Wave Wireless Assisted Robot Navigation with Link State Classification
Mingsheng Yin, Akshaj Veldanda, Amee Trivedi et al.
The millimeter wave (mmWave) bands have attracted considerable attention for high precision localization applications due to the ability to capture high angular and temporal resolution measurements. This paper explores mmWave-based positioning for a target localization problem where a fixed target broadcasts mmWave signals and a mobile robotic agent attempts to capture the signals to locate and navigate to the target. A three-stage procedure is proposed: First, the mobile agent uses tensor decomposition methods to detect the multipath channel components and estimate their parameters. Second, a machine-learning trained classifier is then used to predict the link state, that is, whether the strongest path is line-of-sight (LOS) or non-LOS (NLOS). For the NLOS case, the link state predictor also determines if the strongest path arrived via one or more reflections. Third, based on the link state, the agent either follows the estimated angles or uses computer vision or other sensors to explore and map the environment. The method is demonstrated on a large dataset of indoor environments supplemented with ray tracing to simulate the wireless propagation. The path estimation and link state classification are also integrated into a state-of-the-art neural simultaneous localization and mapping (SLAM) module to augment camera and LIDAR-based navigation. It is shown that the link state classifier can successfully generalize to completely new environments outside the training set. In addition, the neural-SLAM module with the wireless path estimation and link state classifier provides rapid navigation to the target, close to a baseline that knows the target location.
RO Oct 18, 2021
A unified framework for walking and running of bipedal robots
Mahrokh Ghoddousi Boroujeni, Elham Daneshmand, Ludovic Righetti et al.
In this paper, we propose a novel framework capable of generating various walking and running gaits for bipedal robots. The main goal is to relax the fixed center of mass (CoM) height assumption of the linear inverted pendulum model (LIPM) and generate a wider range of walking and running motions, without a considerable increase in complexity. To do so, we use the concept of virtual constraints in the centroidal space which enables generating motions beyond walking while keeping the complexity at a minimum. By a proper choice of these virtual constraints, we show that we can generate different types of walking and running motions. More importantly, enforcing the virtual constraints through feedback renders the dynamics linear and enables us to design a feedback control mechanism which adapts the next step location and timing in face of disturbances, through a simple quadratic program (QP). To show the effectiveness of this framework, we showcase different walking and running simulations of the biped robot Bolt in the presence of both environmental uncertainties and external disturbances.
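For readers unfamiliar with the linear inverted pendulum model that this abstract starts from, the following is a minimal sketch of the standard textbook LIPM, not the authors' relaxed formulation. It assumes a constant CoM height `z0`, a point foot at position `p`, and the usual dynamics x'' = omega^2 (x - p); the function names and numeric values are hypothetical. It also shows the well-known property that the divergent component of motion, x + x'/omega, grows exponentially away from the foot, which is why step location and timing are natural feedback variables.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]


def lipm_state(x0, v0, p, z0, t):
    """Closed-form LIPM CoM position and velocity after time t.

    Assumed dynamics: x'' = omega^2 * (x - p), omega = sqrt(G / z0),
    with the foot (center of pressure) fixed at p during the step.
    """
    omega = math.sqrt(G / z0)
    c, s = math.cosh(omega * t), math.sinh(omega * t)
    x = p + (x0 - p) * c + (v0 / omega) * s
    v = omega * (x0 - p) * s + v0 * c
    return x, v


def dcm(x, v, z0):
    """Divergent component of motion: the unstable part of the LIPM state."""
    omega = math.sqrt(G / z0)
    return x + v / omega


def dcm_after(xi0, p, z0, t):
    """The DCM evolves as xi(t) = p + (xi0 - p) * exp(omega * t)."""
    omega = math.sqrt(G / z0)
    return p + (xi0 - p) * math.exp(omega * t)
```

Propagating the full state with `lipm_state` and then calling `dcm` gives the same value as `dcm_after`, so a step planner only needs to reason about this single scalar per axis.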
RO Aug 4, 2021
Rapid Convex Optimization of Centroidal Dynamics using Block Coordinate Descent
Paarth Shah, Avadesh Meduri, Wolfgang Merkt et al.
In this paper we explore the use of block coordinate descent (BCD) to optimize the centroidal momentum dynamics for dynamically consistent multi-contact behaviors. The centroidal dynamics have recently received a large amount of attention in order to create physically realizable motions for robots with hands and feet while being computationally more tractable than full rigid body dynamics models. Our contribution lies in exploiting the structure of the dynamics in order to simplify the original non-convex problem into two convex subproblems. We iterate between these two subproblems for a set number of iterations or until a consensus is reached. We explore the properties of the proposed optimization method for the centroidal dynamics and verify in simulation that motions generated by our approach can be tracked by the quadruped Solo12. In addition, we compare our method to a recently proposed convexification using a sequence of convex relaxations as well as a more standard interior point method used in the off-the-shelf solver IPOPT to show that our approach finds similar, if not better, trajectories (in terms of cost), and is more than four times faster than both approaches. Finally, compared to previous approaches, we note its practicality due to the convex nature of each subproblem which allows our method to be used with any off-the-shelf quadratic programming solver.
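The alternation between two convex subproblems can be illustrated on a toy problem. The sketch below is not the centroidal formulation from the paper: it assumes a hypothetical scalar biconvex cost (x*y - c)^2 + lam*(x^2 + y^2), where the bilinear term x*y stands in for the kind of product coupling that makes the joint problem non-convex. With one variable fixed, the cost is a convex quadratic in the other, so each block update has a closed form.

```python
def cost(x, y, c=2.0, lam=0.1):
    """Toy biconvex cost: convex in x for fixed y, and in y for fixed x."""
    return (x * y - c) ** 2 + lam * (x * x + y * y)


def block_coordinate_descent(x, y, c=2.0, lam=0.1, iters=200, tol=1e-12):
    """Alternate exact minimization over x and y.

    Stops after `iters` sweeps or when the cost decrease falls below `tol`.
    Returns the final iterates and the cost after every sweep.
    """
    history = [cost(x, y, c, lam)]
    for _ in range(iters):
        # Setting d/dx to zero with y fixed gives x * (y^2 + lam) = c * y.
        x = c * y / (y * y + lam)
        # Same closed form for y with the freshly updated x fixed.
        y = c * x / (x * x + lam)
        history.append(cost(x, y, c, lam))
        if history[-2] - history[-1] < tol:
            break
    return x, y, history
```

Because each block update is an exact minimization, the cost never increases from one sweep to the next. Starting from `x = y = 1.0` with the default parameters, the iterates approach the stationary point x = y = sqrt(c - lam). In a real solver each closed-form update would be replaced by a call to a convex (for example quadratic programming) solver.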
RO Jul 14, 2021
Model-free Reinforcement Learning for Robust Locomotion using Demonstrations from Trajectory Optimization
Miroslav Bogdanovic, Majid Khadiv, Ludovic Righetti
We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilitate initial exploration. In the second stage, the relevant task reward is optimized directly and a policy robust to environment uncertainties is computed. We demonstrate and examine in detail the performance and robustness of our approach on highly dynamic hopping and bounding tasks on a quadruped robot.
OC Jun 25, 2021
$\mathcal{N}$IPM-HLSP: An Efficient Interior-Point Method for Hierarchical Least-Squares Programs
Kai Pfeiffer, Adrien Escande, Ludovic Righetti
Hierarchical least-squares programs with linear constraints (HLSP) are a type of optimization problem very common in robotics. Each priority level contains an objective in least-squares form which is subject to the linear constraints of the higher priority levels. Active-set methods are a popular choice for solving them. However, they can perform poorly in terms of computational time if there are large changes of the active set. We therefore propose a computationally efficient primal-dual interior-point method (IPM) for dense HLSPs which is able to maintain constant numbers of solver iterations in these situations. We base our IPM on the computationally efficient nullspace method as it requires only a single matrix factorization per solver iteration instead of two, as is the case for other IPM formulations. We show that the resulting normal equations can be expressed in least-squares form. This avoids the formation of the quadratic Lagrangian Hessian and can possibly maintain high levels of sparsity. Our solver reliably solves ill-posed instantaneous hierarchical robot control problems without exhibiting the large variations in computation time seen in active-set methods.
LG Jun 22, 2021
Learning Dynamical Systems from Noisy Sensor Measurements using Multiple Shooting
Armand Jordana, Justin Carpentier, Ludovic Righetti
Modeling dynamical systems plays a crucial role in capturing and understanding complex physical phenomena. When physical models are not sufficiently accurate or hardly describable by analytical formulas, one can use generic function approximators such as neural networks to capture the system dynamics directly from sensor measurements. Current methods to learn the parameters of these neural networks are highly sensitive to the inherent instability of most dynamical systems of interest, which in turn prevents the study of very long sequences. In this work, we introduce a generic and scalable method based on multiple shooting to learn latent representations of indirectly observed dynamical systems. We achieve state-of-the-art performance on systems observed directly from raw images. Further, we demonstrate that our method is robust to noisy measurements and can handle complex dynamical systems, such as chaotic ones.
RO Apr 25, 2021
A Robustness Analysis of Inverse Optimal Control of Bipedal Walking
John R. Rebula, Stefan Schaal, James Finley et al.
Cost functions have the potential to provide compact and understandable generalizations of motion. The goal of Inverse Optimal Control (IOC) is to analyze an observed behavior which is assumed to be optimal with respect to an unknown cost function, and infer this cost function. Here we develop a method for characterizing cost functions of legged locomotion, with the goal of representing complex humanoid behavior with simple models. To test this methodology we simulate walking gaits of a simple 5 link planar walking model which optimize known cost functions, and assess the ability of our IOC method to recover them. In particular, the IOC method uses an iterative trajectory optimization process to infer cost function weightings consistent with those used to generate a single demonstrated optimal trial. We also explore sensitivity of the IOC to sensor noise in the observed trajectory, imperfect knowledge of the model or task, as well as uncertainty in the components of the cost function used. With appropriate modeling, these methods may help infer cost functions from human data, yielding a compact and generalizable representation of human-like motion for use in humanoid robot controllers, as well as providing a new tool for experimentally exploring human preferences.
RO Mar 31, 2021
Simultaneous Navigation and Construction Benchmarking Environments
Wenyu Han, Chen Feng, Haoran Wu et al.
We need intelligent robots for mobile construction, the process of navigating in an environment and modifying its structure according to a geometric design. In this task, a major robot vision and learning challenge is how to exactly achieve the design without GPS, due to the difficulty caused by the bi-directional coupling of accurate robot localization and navigation together with strategic environment manipulation. However, many existing robot vision and learning tasks such as visual navigation and robot manipulation address only one of these two coupled aspects. To stimulate the pursuit of a generic and adaptive solution, we reasonably simplify mobile construction as a partially observable Markov decision process (POMDP) in 1/2/3D grid worlds and benchmark the performance of a handcrafted policy with basic localization and planning, and state-of-the-art deep reinforcement learning (RL) methods. Our extensive experiments show that the coupling makes this problem very challenging for those methods, and emphasize the need for novel task-specific solutions.
RO Jan 18, 2021
Exponential Integration for Efficient and Accurate Multi-Body Simulation with Stiff Viscoelastic Contacts
Bilal Hammoud, Luca Olivieri, Ludovic Righetti et al.
The simulation of multi-body systems with frictional contacts is a fundamental tool for many fields, such as robotics, computer graphics, and mechanics. Hard frictional contacts are particularly troublesome to simulate because they make the differential equations stiff, calling for computationally demanding implicit integration schemes. We propose to tackle this issue by using exponential integrators, a long-standing class of integration schemes (first introduced in the 1960s) that in recent years has enjoyed a resurgence of interest. We show that this scheme can be easily applied to multi-body systems subject to stiff viscoelastic contacts, producing accurate results at lower computational cost than classic explicit or implicit schemes. In our tests with quadruped and biped robots, our method demonstrated stable behaviors with large time steps (10 ms) and stiff contacts ($10^5$ N/m). Its excellent properties, especially for fast and coarse simulations, make it a valuable candidate for many applications in robotics, such as simulation, Model Predictive Control, Reinforcement Learning, and controller design.
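To make the idea of exponential integration concrete, here is a minimal sketch on a one-dimensional stand-in, not the multi-body scheme of the paper. It assumes a hypothetical 1 kg point mass resting on a linear spring-damper contact that is always active, uses the stiffness (1e5 N/m) and time step (10 ms) quoted in the abstract, and picks an illustrative damping value. The step propagates the state with the matrix exponential of an augmented system matrix, which is exact for linear dynamics with constant forcing. The small `expm` helper (Taylor series with scaling and squaring) is written out only to keep the example dependency-free.

```python
import math


def matmul(a, b):
    """Dense matrix product for small list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm(a, terms=20):
    """Matrix exponential via a truncated Taylor series with scaling and squaring."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


# Hypothetical parameters: mass [kg], stiffness [N/m], damping [N s/m], gravity.
M, K, B, G = 1.0, 1e5, 20.0, 9.81
H = 0.01  # time step [s]


def simulate_exponential(z, v, steps, h=H):
    """Exponential integrator: exact for z'' = -(K/M) z - (B/M) z' - G."""
    # Augmented state [z, v, 1] folds the constant gravity term into the matrix.
    aug = [[0.0, 1.0, 0.0], [-K / M, -B / M, -G], [0.0, 0.0, 0.0]]
    e = expm([[x * h for x in row] for row in aug])
    for _ in range(steps):
        z, v = (e[0][0] * z + e[0][1] * v + e[0][2],
                e[1][0] * z + e[1][1] * v + e[1][2])
    return z, v


def simulate_explicit_euler(z, v, steps, h=H):
    """Explicit Euler on the same system, for comparison."""
    for _ in range(steps):
        z, v = z + h * v, v + h * (-(K / M) * z - (B / M) * v - G)
    return z, v
```

With these values the explicit Euler update is unstable at a 10 ms step, while the exponential step settles to the static penetration `-M * G / K`. A nonlinear multi-body simulator would apply the same idea to a local linearization of the contact dynamics, which this sketch does not attempt.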
RO Nov 9, 2020
Impedance Optimization for Uncertain Contact Interactions Through Risk Sensitive Optimal Control
Bilal Hammoud, Majid Khadiv, Ludovic Righetti
This paper addresses the problem of computing optimal impedance schedules for legged locomotion tasks involving complex contact interactions. We formulate the problem of impedance regulation as a trade-off between disturbance rejection and measurement uncertainty. We extend a stochastic optimal control algorithm known as Risk Sensitive Control to take into account measurement uncertainty and propose a formal way to include such uncertainty for unknown contact locations. The approach can efficiently generate optimal state and control trajectories along with local feedback control gains, i.e. impedance schedules. Extensive simulations demonstrate the capabilities of the approach in generating meaningful stiffness and damping modulation patterns before and after contact interaction. For example, contact forces are reduced during early contacts, damping increases to anticipate a high impact event and tracking is automatically traded-off for increased stability. In particular, we show a significant improvement in performance during jumping and trotting tasks with a simulated quadruped robot.
RO Nov 7, 2020
Leveraging Forward Model Prediction Error for Learning Control
Sarah Bechtle, Bilal Hammoud, Akshara Rai et al.
Learning for model-based control can be sample-efficient and generalize well; however, successfully learning models and controllers that represent the problem at hand can be challenging for complex tasks. Using inaccurate models for learning can lead to sub-optimal solutions that are unlikely to perform well in practice. In this work, we present a learning approach which iterates between model learning and data collection and leverages forward model prediction error for learning control. We show how using the controller's prediction as input to a forward model can create a differentiable connection between the controller and the model, allowing us to formulate a loss in the state space. This lets us include forward model prediction error during controller learning, and we show that this creates a loss objective that significantly improves learning on different motor control tasks. We provide empirical and theoretical results that show the benefits of our method and present evaluations in simulation for learning control on a 7 DoF manipulator and an underactuated 12 DoF quadruped. We show that our approach successfully learns controllers for challenging motor control tasks involving contact switching.
RO Nov 5, 2020
Learning a Centroidal Motion Planner for Legged Locomotion
Julian Viereck, Ludovic Righetti
Whole-body optimizers have been successful at automatically computing complex dynamic locomotion behaviors. However, they are often limited to offline planning as they are computationally too expensive to replan at a high frequency. Simpler models are then typically used for online replanning. In this paper we present a method to generate whole-body movements in real time for locomotion tasks. Our approach consists of learning a centroidal neural network that predicts the desired centroidal motion given the current state of the robot and a desired contact plan. The network is trained using an existing whole-body motion optimizer. Our approach makes it possible to learn, from few training samples, dynamic motions that can be used in a complete whole-body control framework at high frequency, which is usually not attainable with typical full-body optimizers. We demonstrate our method by generating a rich set of walking and jumping motions on a real quadruped robot.
RO Oct 28, 2020
DeepQ Stepper: A framework for reactive dynamic walking on uneven terrain
Avadesh Meduri, Majid Khadiv, Ludovic Righetti
Reactive stepping and push recovery for biped robots are often restricted to flat terrains because of the difficulty in computing capture regions for nonlinear dynamic models. In this paper, we address this limitation by using reinforcement learning to approximately learn the 3D capture region for such systems. We propose a novel 3D reactive stepper, the DeepQ stepper, that computes optimal step locations for walking at different velocities using the 3D capture regions approximated by the action-value function. We demonstrate the ability of the approach to learn stepping with a simplified 3D pendulum model and with full robot dynamics. Further, the stepper achieves higher performance when it learns approximate capture regions while taking into account the entire dynamics of the robot, which are often ignored in existing reactive steppers based on simplified models. The DeepQ stepper can handle non-convex terrain with obstacles, walk on restricted surfaces like stepping stones, and recover from external disturbances at a constant computational cost.
RO Oct 16, 2020
Robot Learning with Crash Constraints
Alonso Marco, Dominik Baumann, Majid Khadiv et al.
In the past decade, numerous machine learning algorithms have been shown to successfully learn optimal policies to control real robotic systems. However, it is common to encounter failing behaviors as the learning loop progresses. Specifically, in robot applications where failing is undesired but not catastrophic, many algorithms struggle with leveraging data obtained from failures. This is usually caused by (i) the failed experiment ending prematurely, or (ii) the acquired data being scarce or corrupted. Both complicate the design of proper reward functions to penalize failures. In this paper, we propose a framework that addresses those issues. We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints, where no data is obtained upon constraint violation. The no-data case is addressed by a novel GP model (GPCR) for the constraint that combines discrete events (failure/success) with continuous observations (only obtained upon success). We demonstrate the effectiveness of our framework on simulated benchmarks and on a real jumping quadruped, where the constraint threshold is unknown a priori. Experimental data is collected directly on the real robot by means of constrained Bayesian optimization. Our results outperform manual tuning, and GPCR proves useful for estimating the constraint threshold.
RO Oct 9, 2020
Robust walking based on MPC with viability guarantees
Mohammad Hasan Yeganegi, Majid Khadiv, Andrea Del Prete et al.
Model predictive control (MPC) has shown great success for controlling complex systems such as legged robots. However, when closing the loop, the performance and feasibility of the finite horizon optimal control problem (OCP) solved at each control cycle are no longer guaranteed. This is due to model discrepancies, the effect of low-level controllers, uncertainties and sensor noise. To address these issues, we propose a modified version of a standard MPC approach used in legged locomotion with viability (weak forward invariance) guarantees. In this approach, instead of adding a (conservative) terminal constraint to the problem, we propose to use the measured state projected to the viability kernel in the OCP solved at each control cycle. Moreover, we use past experimental data to find the best cost weights, which measure a combination of performance, constraint satisfaction robustness, or stability (invariance). These interpretable costs measure the trade-off between robustness and performance. For this purpose, we use Bayesian optimization (BO) to systematically design experiments that help efficiently collect data to learn a cost function leading to robust performance. Our simulation results with different realistic disturbances (i.e. external pushes, unmodeled actuator dynamics and computational delay) show the effectiveness of our approach to create robust controllers for humanoid robots.
RO Oct 2, 2020
Efficient Multi-Contact Pattern Generation with Sequential Convex Approximations of the Centroidal Dynamics
Brahayam Ponton, Majid Khadiv, Avadesh Meduri et al.
This paper investigates the problem of efficient computation of physically consistent multi-contact behaviors. Recent work showed that under mild assumptions, the problem could be decomposed into simpler kinematic and centroidal dynamic optimization problems. Based on this approach, we propose a general convex relaxation of the centroidal dynamics leading to two computationally efficient algorithms based on iterative resolutions of second order cone programs. They optimize centroidal trajectories, contact forces and, importantly, the timing of the motions. We include the approach in a kino-dynamic optimization method to generate full-body movements. Finally, the approach is embedded in a mixed-integer solver to further find dynamically consistent contact sequences. Extensive numerical experiments demonstrate the computational efficiency of the approach, suggesting that it could be used in a fast receding horizon control loop. Executions of the planned motions on simulated humanoids and quadrupeds and on a real quadruped robot further show the quality of the optimized motions.
RO Aug 19, 2020
Enabling Remote Whole-Body Control with 5G Edge Computing
Huaijiang Zhu, Manali Sharma, Kai Pfeiffer et al.
Real-world applications require light-weight, energy-efficient, fully autonomous robots. Yet, increasing autonomy is oftentimes synonymous with escalating computational requirements. It might thus be desirable to offload intensive computation (not only sensing and planning, but also low-level whole-body control) to remote servers in order to reduce on-board computational needs. Fifth Generation (5G) wireless cellular technology, with its low latency and high bandwidth capabilities, has the potential to unlock cloud-based high performance control of complex robots. However, state-of-the-art control algorithms for legged robots can only tolerate very low control delays, which even ultra-low latency 5G edge computing can sometimes fail to achieve. In this work, we investigate the problem of cloud-based whole-body control of legged robots over a 5G link. We propose a novel approach that consists of a standard optimization-based controller on the network edge and a local linear, approximately optimal controller that significantly reduces on-board computational needs while increasing robustness to delay and possible loss of communication. Simulation experiments on humanoid balancing and walking tasks that include a realistic 5G communication model demonstrate significant improvement of the reliability of robot locomotion under jitter and delays likely to be experienced in 5G wireless links.
SY May 15, 2020
Stochastic and Robust MPC for Bipedal Locomotion: A Comparative Study on Robustness and Performance
Ahmad Gazar, Majid Khadiv, Andrea Del Prete et al.
Linear Model Predictive Control (MPC) has been successfully used for generating feasible walking motions for humanoid robots. However, the effect of uncertainties on constraint satisfaction has only been studied using Robust MPC (RMPC) approaches, which account for the worst-case realization of bounded disturbances at each time instant. In this letter, we propose for the first time to use linear stochastic MPC (SMPC) to account for uncertainties in bipedal walking. We show that SMPC offers more flexibility to the user (or a high level decision maker) by tolerating small (user-defined) probabilities of constraint violation. Therefore, SMPC can be tuned to achieve a constraint satisfaction probability that is arbitrarily close to 100%, but without sacrificing performance as much as tube-based RMPC. We compare SMPC against RMPC in terms of robustness (constraint satisfaction) and performance (optimality). Our results highlight the benefits of SMPC and its interest for the robotics community as a powerful mathematical tool for dealing with uncertainties.
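The difference between tolerating a small violation probability and guarding against the worst case can be sketched for a single scalar constraint. The snippet below assumes additive zero-mean Gaussian noise on the constrained quantity, which the abstract does not state, and the function names are hypothetical. Under that assumption, a chance constraint with violation probability beta tightens the nominal bound by the (1 - beta) quantile of the noise, whereas a robust formulation tightens it by the largest possible disturbance.

```python
from statistics import NormalDist

def chance_constraint_backoff(sigma, violation_prob):
    """Back-off s such that P(noise <= s) = 1 - violation_prob for noise ~ N(0, sigma^2)."""
    return NormalDist().inv_cdf(1.0 - violation_prob) * sigma

def stochastic_bound(bound, sigma, violation_prob):
    """Tightened bound for the nominal state under the chance constraint."""
    return bound - chance_constraint_backoff(sigma, violation_prob)

def robust_bound(bound, worst_case_disturbance):
    """Tightened bound when every bounded disturbance must be tolerated."""
    return bound - worst_case_disturbance
```

Lowering `violation_prob` moves the stochastic bound toward the conservative robust one, which is the tuning knob the abstract refers to.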
RO Sep 19, 2019
Robust Humanoid Contact Planning with Learned Zero- and One-Step Capturability Prediction
Yu-Chi Lin, Ludovic Righetti, Dmitry Berenson
Humanoid robots maintain balance and navigate by controlling the contact wrenches applied to the environment. While it is possible to plan dynamically-feasible motion that applies appropriate wrenches using existing methods, a humanoid may also be affected by external disturbances. Existing systems typically rely on controllers to reactively recover from disturbances. However, such controllers may fail when the robot cannot reach contacts capable of rejecting a given disturbance. In this paper, we propose a search-based footstep planner which aims to maximize the probability of the robot successfully reaching the goal without falling as a result of a disturbance. The planner considers not only the poses of the planned contact sequence, but also alternative contacts near the planned contact sequence that can be used to recover from external disturbances. Although this additional consideration significantly increases the computation load, we train neural networks to efficiently predict multi-contact zero-step and one-step capturability, which allows the planner to generate robust contact sequences efficiently. Our results show that our approach generates footstep sequences that are more robust to external disturbances than a conventional footstep planner in four challenging scenarios.
RO Aug 10, 2019
Learning to Explore in Motion and Interaction Tasks
Miroslav Bogdanovic, Ludovic Righetti
Model-free reinforcement learning suffers from the high sampling complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies, which leads to slow policy convergence. In this paper we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration based on data from previously solved tasks to improve learning new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations on a robot manipulator performing a variety of motion and contact interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems.
RO Jul 17, 2019
Learning Variable Impedance Control for Contact Sensitive Tasks
Miroslav Bogdanovic, Majid Khadiv, Ludovic Righetti
Reinforcement learning algorithms have shown great success in solving different problems ranging from playing video games to robotics. However, they struggle to solve delicate robotic problems, especially those involving contact interactions. Though in principle a policy directly outputting joint torques should be able to learn to perform these tasks, in practice we see that it has difficulty robustly solving the problem without any given structure in the action space. In this paper, we investigate how the choice of action space can give robust performance in the presence of contact uncertainties. We propose learning a policy that outputs impedance and desired position in joint space and compare the performance of that approach to torque and position control under different contact uncertainties. Furthermore, we propose an additional reward term designed to regularize these variable impedance control policies, giving them interpretability and facilitating their transfer to real systems. We present extensive experiments in simulation of both floating and fixed-base systems in tasks involving contact uncertainties, as well as results for running the learned policies on a real system.
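A joint-space impedance action can be turned into torques with a spring-damper law. The sketch below is one common form, assumed here for illustration: the policy is taken to output per-joint stiffness and a desired position, and damping is supplied separately. The function name and the choice to pass damping explicitly are assumptions, not details from the abstract.

```python
import numpy as np

def impedance_torque(q, dq, q_des, stiffness, damping):
    """Joint torques tau = K (q_des - q) - D dq, applied per joint.

    q, dq      : measured joint positions and velocities
    q_des      : desired joint positions (assumed policy output)
    stiffness  : per-joint stiffness K (assumed policy output)
    damping    : per-joint damping D
    """
    q, dq, q_des = (np.asarray(a, dtype=float) for a in (q, dq, q_des))
    K = np.asarray(stiffness, dtype=float)
    D = np.asarray(damping, dtype=float)
    return K * (q_des - q) - D * dq
```

With low stiffness the same position error produces a small torque, which is what lets such a policy stay compliant when contact occurs earlier or later than expected.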
RO Jul 10, 2019
Robust Humanoid Locomotion Using Trajectory Optimization and Sample-Efficient Learning
Mohammad Hasan Yeganegi, Majid Khadiv, S. Ali A. Moosavian et al.
Trajectory optimization (TO) is one of the most powerful tools for generating feasible motions for humanoid robots. However, including uncertainties and stochasticity in the TO problem to generate robust motions can easily lead to intractable problems. Furthermore, since the models used in TO always have some level of abstraction, it can be hard to find a realistic set of uncertainties in the model space. In this paper we leverage a sample-efficient learning technique (Bayesian optimization) to robustify TO for humanoid locomotion. The main idea is to use data from full-body simulations to make the TO stage robust by tuning the cost weights. To this end, we split the TO problem into two phases. The first phase solves a convex optimization problem for generating center of mass (CoM) trajectories based on simplified linear dynamics. The second phase employs iterative Linear-Quadratic Gaussian (iLQG) as a whole-body controller to generate full-body control inputs. Then we use Bayesian optimization to find the cost weights for the first phase that yield robust performance in simulation or experiment in the presence of different disturbances and uncertainties. The results show that the proposed approach is able to generate robust motions for different sets of disturbances and uncertainties.
RO Jun 9, 2019
Trajectory Optimization for Robust Humanoid Locomotion with Sample-Efficient Learning
Majid Khadiv, Mohammad Hasan Yeganegi, S. Ali A. Moosavian et al.
Trajectory optimization (TO) is one of the most powerful tools for generating feasible motions for humanoid robots. However, including uncertainties and stochasticity in the TO problem to generate robust motions can easily lead to an intractable problem. Furthermore, since the models used in TO always have some level of abstraction, it is hard to find a realistic set of uncertainties in the space of the abstract model. In this paper we aim at leveraging a sample-efficient learning technique (Bayesian optimization) to robustify trajectory optimization for humanoid locomotion. The main idea is to use Bayesian optimization to find the optimal set of cost weights, which trades off performance against robustness, using a few realistic simulations or experiments. The results show that the proposed approach is able to generate robust motions for different sets of disturbances and uncertainties.
RO Apr 15, 2019
Curious iLQR: Resolving Uncertainty in Model-based RL
Sarah Bechtle, Yixin Lin, Akshara Rai et al.
Curiosity as a means to explore during reinforcement learning problems has recently become very popular. However, very little progress has been made in utilizing curiosity for learning control. In this work, we propose a model-based reinforcement learning (MBRL) framework that combines Bayesian modeling of the system dynamics with curious iLQR, an iterative LQR approach that considers model uncertainty. During trajectory optimization the curious iLQR attempts to minimize both the task-dependent cost and the uncertainty in the dynamics model. We demonstrate the approach on reaching tasks with 7-DoF manipulators in simulation and on a real robot. Our experiments show that MBRL with curious iLQR reaches desired end-effector targets more reliably and with fewer system rollouts when learning a new task from scratch, and that the learned model generalizes better to new reaching tasks.
RO Oct 31, 2018
Efficient Humanoid Contact Planning using Learned Centroidal Dynamics Prediction
Yu-Chi Lin, Brahayam Ponton, Ludovic Righetti et al.
Humanoid robots dynamically navigate an environment by interacting with it via contact wrenches exerted at intermittent contact poses. Therefore, it is important to consider dynamics when planning a contact sequence. Traditional contact planning approaches assume a quasi-static balance criterion to reduce the computational challenges of selecting a contact sequence over a rough terrain. This, however, limits the applicability of the approach when dynamic motions are required, such as when walking down a steep slope or crossing a wide gap. Recent methods overcome this limitation with the help of efficient mixed integer convex programming solvers capable of synthesizing dynamic contact sequences. Nevertheless, their exponential-time complexity limits their applicability to short time horizon contact sequences within small environments. In this paper, we go beyond current approaches by learning a prediction of the dynamic evolution of the robot centroidal momenta, which can then be used for quickly generating dynamically robust contact sequences for robots with arms and legs using a search-based contact planner. We demonstrate the efficiency and quality of the results of the proposed approach in a set of dynamically challenging scenarios.
RO Sep 19, 2018
Leveraging Contact Forces for Learning to Grasp
Hamza Merzic, Miroslav Bogdanovic, Daniel Kappler et al.
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two-fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.
RO Mar 6, 2018
Learning Task-Specific Dynamics to Improve Whole-Body Control
Andrej Gams, Sean A. Mason, Aleš Ude et al.
In task-based inverse dynamics control, reference accelerations used to follow a desired plan can be broken down into feedforward and feedback trajectories. The feedback term accounts for tracking errors that are caused by inaccurate dynamic models or external disturbances. On underactuated, free-floating robots, such as humanoids, good tracking accuracy often necessitates high feedback gains, which leads to undesirably stiff behaviors. The magnitude of these gains is in any case often strongly limited by the control bandwidth. In this paper, we show how to reduce the required contribution of the feedback controller by incorporating learned task-space reference accelerations. Thus, we i) improve the execution of the given specific task, and ii) offer the means to reduce feedback gains, providing for greater compliance of the system. In contrast to learning task-specific joint-torques, which might produce a similar effect but can lead to poor generalization, our approach directly learns the task-space dynamics of the center of mass of a humanoid robot. Simulated and real-world results on the lower part of the Sarcos Hermes humanoid robot demonstrate the applicability of the approach.
RO Dec 26, 2017
An MPC Walking Framework With External Contact Forces
Sean Mason, Nicholas Rotella, Stefan Schaal et al.
In this work, we present an extension to a linear Model Predictive Control (MPC) scheme that plans external contact forces for the robot when given multiple contact locations and their corresponding friction cones. To this end, we set up a two-step optimization problem. In the first optimization, we compute the Center of Mass (CoM) trajectory and footstep locations, and introduce slack variables to account for violating the imposed constraints on the Zero Moment Point (ZMP). We then use the slack variables to trigger the second optimization, in which we calculate the optimal external force that compensates for the ZMP tracking error. This optimization considers multiple contact positions within the environment by formulating the problem as a Mixed Integer Quadratic Program (MIQP) that can be solved at a rate of 100-300 Hz. Once contact is created, the MIQP reduces to a single Quadratic Program (QP) that can be solved in real time (< 1 kHz). Simulations show that the presented walking control scheme can withstand disturbances 2-3x larger with the additional force provided by a hand contact.
RO Sep 29, 2017
Learning a Structured Neural Network Policy for a Hopping Task
Julian Viereck, Jules Kozolinsky, Alexander Herzog et al.
In this work we present a method for learning a reactive policy for a simple dynamic locomotion task involving hard impact and switching contacts where we assume the contact location and contact timing to be unknown. To learn such a policy, we use optimal control to optimize a local controller for a fixed environment and contacts. We learn the contact-rich dynamics for our underactuated systems along these trajectories in a sample-efficient manner. We use the optimized policies to learn the reactive policy in the form of a neural network. Using a new neural network architecture, we are able to preserve more information from the local policy and make its output interpretable in terms of desired trajectories, feedforward commands and gains. Extensive simulations demonstrate the robustness of the approach to changing environments, outperforming a model-free policy-gradient-based method on the same tasks in simulation. Finally, we show that the learned policy can be robustly transferred to a real robot.
RO Sep 26, 2017
On Time Optimization of Centroidal Momentum Dynamics
Brahayam Ponton, Alexander Herzog, Andrea Del Prete et al.
Recently, the centroidal momentum dynamics has received substantial attention for planning dynamically consistent motions for robots with arms and legs in multi-contact scenarios. However, it is also non-convex, which renders any optimization approach difficult, and timing is usually kept fixed in most trajectory optimization techniques so as not to introduce additional non-convexities to the problem. But this can limit the versatility of the algorithms. In our previous work, we proposed a convex relaxation of the problem that made it possible to efficiently compute momentum trajectories and contact forces. However, our approach could not minimize a desired angular momentum objective, which seriously limited its applicability. Noticing that the non-convexity introduced by the time variables is of a similar nature to that of the centroidal dynamics, we propose two convex relaxations to the problem based on trust regions and soft constraints. The resulting approaches can compute time-optimized dynamically consistent trajectories sufficiently fast to make the approach real-time capable. The performance of the algorithm is demonstrated in several multi-contact scenarios for a humanoid robot. In particular, we show that the proposed convex relaxation of the original problem finds solutions that are consistent with the original non-convex problem and illustrate how timing optimization makes it possible to find motion plans that would be difficult to plan with fixed timing.
RO Sep 21, 2017
Unsupervised Contact Learning for Humanoid Estimation and Control
Nicholas Rotella, Stefan Schaal, Ludovic Righetti
This work presents a method for contact state estimation using fuzzy clustering to learn contact probability for full, six-dimensional humanoid contacts. The data required for training comes solely from proprioceptive sensors, namely end-effector contact wrench sensors and inertial measurement units (IMUs), and the method is completely unsupervised. The resulting cluster means are used to efficiently compute the probability of contact in each of the six end-effector degrees of freedom (DoFs) independently. This clustering-based contact probability estimator is validated in a kinematics-based base state estimator in a simulation environment with realistic added sensor noise for locomotion over rough, low-friction terrain on which the robot is subject to foot slip and rotation. The proposed base state estimator, which utilizes these six DoF contact probability estimates, is shown to perform considerably better than one that determines kinematic contact constraints purely based on measured normal force.
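How cluster means can yield a contact probability is sketched below with standard fuzzy c-means on a single scalar signal, for example a measured normal force. The one-dimensional setting, the fuzzifier m = 2, the initialization at the data range, and the function names are all assumptions for illustration; the abstract does not specify these details.

```python
import numpy as np

def memberships(x, centers, m=2.0, eps=1e-12):
    """Fuzzy membership of each sample in each cluster; rows sum to 1."""
    d = np.abs(x[:, None] - centers[None, :]) + eps  # sample-to-center distances
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(x, n_clusters=2, m=2.0, iters=50):
    """Alternate membership and center updates; returns sorted-by-init centers and memberships."""
    x = np.asarray(x, dtype=float)
    centers = np.linspace(x.min(), x.max(), n_clusters)  # deterministic initialization
    for _ in range(iters):
        w = memberships(x, centers, m) ** m
        centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
    return centers, memberships(x, centers, m)
```

After fitting on unlabeled data, the membership of a new sample in the high-force cluster can be read as a contact probability, and only the cluster means need to be stored to evaluate it online.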
RO Aug 6, 2017
Pattern Generation for Walking on Slippery Terrains
Majid Khadiv, S. Ali A. Moosavian, Alexander Herzog et al.
In this paper, we extend state-of-the-art Model Predictive Control (MPC) approaches to generate safe bipedal walking on slippery surfaces. In this setting, we formulate walking as a trade-off between realizing a desired walking velocity and preserving robust foot-ground contact. Exploiting this formulation inside MPC, we show that safe walking on various flat terrains can be achieved by trading off three main attributes, i.e. walking velocity tracking, Zero Moment Point (ZMP) modulation, and Required Coefficient of Friction (RCoF) regulation. Simulation results show that increasing the walking velocity increases the possibility of slippage, while reducing the slippage possibility conflicts with reducing the tip-over possibility of the contact and vice versa.
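The required coefficient of friction is commonly defined as the ratio of tangential to normal ground reaction force; the foot is expected to slip when that ratio exceeds the friction actually available. A minimal sketch under that common definition follows; the function names and the handling of a non-positive normal force are assumptions.

```python
import math

def required_cof(fx, fy, fz):
    """Ratio of tangential force magnitude to normal force (fz > 0 means pushing on the ground)."""
    if fz <= 0.0:
        return math.inf  # no normal load: any tangential force would cause slip
    return math.hypot(fx, fy) / fz

def is_slipping(fx, fy, fz, friction_coefficient):
    """True when the contact force lies outside the Coulomb friction cone."""
    return required_cof(fx, fy, fz) > friction_coefficient
```

For example, a contact force of (30, 40, 100) N requires a friction coefficient of 0.5, so it would hold on a surface with a coefficient of 0.6 but slip on one with 0.3.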
RO Apr 5, 2017
Walking Control Based on Step Timing Adaptation
Majid Khadiv, Alexander Herzog, S. Ali A. Moosavian et al.
Step adjustment can improve the gait robustness of biped robots; however, the adaptation of step timing is often neglected as it gives rise to non-convex problems when optimized over several footsteps. In this paper, we argue that it is not necessary to optimize walking over several steps to ensure gait viability and show that it is sufficient to merely select the next step timing and location. Using this insight, we propose a novel walking pattern generator that optimally selects step location and timing at every control cycle. Our approach is computationally simple compared to standard approaches in the literature, yet guarantees that any viable state will remain viable in the future. We propose a swing foot adaptation strategy and integrate the pattern generator with an inverse dynamics controller that explicitly controls neither the center of mass nor the foot center of pressure. This is particularly useful for biped robots with limited control authority over their foot center of pressure, such as robots with point feet or passive ankles. Extensive simulations on a humanoid robot with passive ankles demonstrate the capabilities of the approach in various walking situations, including external pushes and foot slippage, and emphasize the importance of step timing adaptation to stabilize walking.
RO Jan 27, 2017
Balancing and Walking Using Full Dynamics LQR Control With Contact Constraints
Sean Mason, Nicholas Rotella, Stefan Schaal et al.
Torque control algorithms which consider robot dynamics and contact constraints are important for creating dynamic behaviors for humanoids. As computational power increases, algorithms tend to also increase in complexity. However, it is not clear how much complexity is really required to create controllers which exhibit good performance. In this paper, we study the capabilities of a simple approach based on contact-consistent LQR controllers designed around key poses to control various tasks on a humanoid robot. We present extensive experimental results on a hydraulic, torque-controlled humanoid performing balancing and stepping tasks. This feedback control approach captures the necessary synergies between the DoFs of the robot to guarantee good control performance. We show that for the considered tasks, it is only necessary to re-linearize the dynamics of the robot at different contact configurations and that increasing the number of LQR controllers along desired trajectories does not improve performance. Our results suggest that very simple controllers can yield good performance competitive with current state-of-the-art, but more complex, optimization-based whole-body controllers. A video of the experiments can be found at https://youtu.be/5T08CNKV1hw.
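For readers unfamiliar with LQR, the sketch below computes a plain discrete-time infinite-horizon LQR gain by iterating the Riccati recursion for a given linearization (A, B). It deliberately omits the contact-consistency step mentioned in the abstract, and the function name, tolerance, and iteration cap are assumptions chosen for illustration.

```python
import numpy as np

def dlqr(A, B, Q, R, tol=1e-9, max_iters=100_000):
    """Gain K for u = -K x minimizing sum(x'Qx + u'Ru) subject to x+ = A x + B u."""
    P = Q.copy()
    for _ in range(max_iters):
        # Gain for the current cost-to-go matrix P
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P = Q + A'PA - A'PB (R + B'PB)^-1 B'PA
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            return K, P_next
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")
```

In a key-pose scheme of the kind the abstract describes, one such gain would be computed per linearization point and the controller would switch between them as the contact configuration changes.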
RO Oct 7, 2016
Step Timing Adjustment: A Step toward Generating Robust Gaits
Majid Khadiv, Alexander Herzog, S. Ali. A. Moosavian et al.
Step adjustment for humanoid robots has been shown to improve robustness in gaits. However, step duration adaptation is often neglected in control strategies. In this paper, we propose an approach that combines both step location and timing adjustment for generating robust gaits. In this approach, step location and step timing are decided based on feedback from the current state of the robot. The proposed approach comprises two stages. In the first stage, the nominal step location and step duration for the next step or a previewed number of steps are specified. In this stage, which is performed at the start of each step, the main goal is to specify the best step length and step duration for a desired walking speed. The second stage deals with finding the best landing point and landing time of the swing foot at each control cycle. In this stage, stability of the gaits is preserved by specifying a desired offset between the swing foot landing point and the Divergent Component of Motion (DCM) at the end of the current step. After specifying the landing point of the swing foot at a desired time, the swing foot trajectory is regenerated at each control cycle to realize the desired landing properties. Simulations of different scenarios show the robustness of the gaits generated by our proposed approach compared to the case where no timing adjustment is employed.
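The role of the DCM offset can be illustrated with the linear inverted pendulum model, in which the DCM diverges exponentially away from the stance foot. The sketch below is one-dimensional and assumes a point foot, a constant center of mass height, and an offset defined as the DCM at touchdown minus the landing point; the function names are hypothetical.

```python
import math

def lipm_omega(com_height, gravity=9.81):
    """Natural frequency of the linear inverted pendulum."""
    return math.sqrt(gravity / com_height)

def dcm(com_pos, com_vel, omega):
    """Divergent Component of Motion: xi = x + x_dot / omega."""
    return com_pos + com_vel / omega

def dcm_at(dcm_now, stance_foot, omega, t):
    """DCM after time t with the stance foot fixed: u + (xi - u) exp(omega t)."""
    return stance_foot + (dcm_now - stance_foot) * math.exp(omega * t)

def landing_point(dcm_now, stance_foot, omega, time_to_touchdown, desired_offset):
    """Foot location that leaves the desired offset to the DCM at touchdown."""
    return dcm_at(dcm_now, stance_foot, omega, time_to_touchdown) - desired_offset
```

A push that moves the DCM further from the stance foot shifts the computed landing point further out, and shortening `time_to_touchdown` pulls it back in, which is why adjusting timing as well as location enlarges the set of recoverable disturbances.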