Wolfgang Merkt

28 papers

783 citations

Novelty 52%

AI Score 31

Ranked #143,295 of 205,806 authors (top 70%) · #4,761 in RO (top 63%)

28 Papers

RO · May 2, 2022

VAE-Loco: Versatile Quadruped Locomotion by Learning a Disentangled Gait Representation

Alexander L. Mitchell, Wolfgang Merkt, Mathieu Geisert et al. · deepmind

Quadruped locomotion is rapidly maturing to a degree where robots are able to realise highly dynamic manoeuvres. However, current planners are unable to vary key gait parameters of the in-swing feet midair. In this work we address this limitation and show that it is pivotal in increasing controller robustness by learning a latent space capturing the key stance phases constituting a particular gait. This is achieved via a generative model trained on a single trot style, which encourages disentanglement such that application of a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. We demonstrate that specific properties of the drive signal map directly to gait parameters such as cadence, footstep height and full stance duration. Due to the nature of our approach these synthesised gaits are continuously variable online during robot operation. The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on two versions of the real ANYmal quadruped robots and demonstrate that our method achieves a continuous blend of dynamic trot styles whilst being robust and reactive to external perturbations.

RO · Mar 14, 2022

Agile Maneuvers in Legged Robots: a Predictive Control Approach

Carlos Mastalli, Wolfgang Merkt, Guiyang Xin et al.

Planning and execution of agile locomotion maneuvers have been a longstanding challenge in legged robotics. It requires deriving motion plans and local feedback policies in real time to handle the nonholonomy of the kinetic momenta. To achieve this, we propose a hybrid predictive controller that considers the robot's actuation limits and full-body dynamics. It combines the feedback policies with tactile information to locally predict future actions. It converges within a few milliseconds thanks to a feasibility-driven approach. Our predictive controller enables ANYmal robots to generate agile maneuvers in realistic scenarios. A crucial element is to track the local feedback policies as, in contrast to whole-body control, they achieve the desired angular momentum. To the best of our knowledge, our predictive controller is the first to handle actuation limits, generate agile locomotion maneuvers, and execute optimal feedback policies for low-level torque control without the use of a separate whole-body controller.

RO · Sep 26, 2022

Learning and Deploying Robust Locomotion Policies with Minimal Dynamics Randomization

Luigi Campanaro, Siddhant Gangapurwala, Wolfgang Merkt et al.

Training deep reinforcement learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behaviour. In this regard, simulators provide a cheap and abundant source. For successful sim-to-real transfer, exhaustively engineered approaches such as system identification, dynamics randomization, and domain adaptation are generally employed. As an alternative, we investigate a simple strategy of random force injection (RFI) to perturb system dynamics during training. We show that the application of random forces enables us to emulate dynamics randomization. This allows us to obtain locomotion policies that are robust to variations in system dynamics. We further extend RFI, referred to as extended random force injection (ERFI), by introducing an episodic actuation offset. We demonstrate that ERFI provides additional robustness to variations in system mass, offering on average a 53% improvement in performance over RFI. We also show that ERFI is sufficient to perform a successful sim-to-real transfer on two different quadrupedal platforms, ANYmal C and Unitree A1, even for perceptive locomotion over uneven terrain in outdoor environments.
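The RFI and ERFI ideas described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the perturbation is applied to joint torques, both terms are drawn from uniform distributions, and the function names `sample_episode_offset` and `perturb_torques` are hypothetical.

```python
import random


def sample_episode_offset(limit, n_joints, rng):
    """ERFI-style offset: drawn once at the start of an episode, then held fixed.

    Assumption: uniform sampling in [-limit, limit] per joint.
    """
    return [rng.uniform(-limit, limit) for _ in range(n_joints)]


def perturb_torques(torques, rfi_limit, episode_offset, rng):
    """Add a fresh random torque at every step (RFI) plus the held episodic offset."""
    return [
        tau + rng.uniform(-rfi_limit, rfi_limit) + offset
        for tau, offset in zip(torques, episode_offset)
    ]


# Usage: one simulated episode with three joints.
rng = random.Random(0)
offset = sample_episode_offset(limit=2.0, n_joints=3, rng=rng)
commanded = [1.0, -0.5, 0.0]
for _ in range(3):
    applied = perturb_torques(commanded, rfi_limit=1.0,
                              episode_offset=offset, rng=rng)
    # `applied` would be sent to the simulator in place of `commanded`.
```

The per-step term changes at every control step, while the offset stays constant within an episode, which is the distinction the abstract draws between RFI and its extension.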

AI · Oct 3, 2022

Multi-Agent Chance-Constrained Stochastic Shortest Path with Application to Risk-Aware Intelligent Intersection

Majid Khonji, Rashid Alyassi, Wolfgang Merkt et al.

In transportation networks, where traffic lights have traditionally been used for vehicle coordination, intersections act as natural bottlenecks. A formidable challenge for existing automated intersections lies in detecting and reasoning about uncertainty from the operating environment and human-driven vehicles. In this paper, we propose a risk-aware intelligent intersection system for autonomous vehicles (AVs) as well as human-driven vehicles (HVs). We cast the problem as a novel class of Multi-agent Chance-Constrained Stochastic Shortest Path (MCC-SSP) problems and devise an exact Integer Linear Programming (ILP) formulation that is scalable in the number of agents' interaction points (e.g., potential collision points at the intersection). In particular, when the number of agents within an interaction point is small, which is often the case in intersections, the ILP has a polynomial number of variables and constraints. To further improve the running time performance, we show that the collision risk computation can be performed offline. Additionally, a trajectory optimization workflow is provided to generate risk-aware trajectories for any given intersection. The proposed framework is implemented in the CARLA simulator and evaluated under a fully autonomous intersection with AVs only as well as in a hybrid setup with a signalized intersection for HVs and an intelligent scheme for AVs. As verified via simulations, the featured approach improves the intersection's efficiency by up to 200% while also conforming to the specified tunable risk threshold.

RO · Apr 25, 2023

Roll-Drop: accounting for observation noise with a single parameter

Luigi Campanaro, Daniele De Martini, Siddhant Gangapurwala et al.

This paper proposes a simple strategy for sim-to-real in Deep-Reinforcement Learning (DRL) -- called Roll-Drop -- that uses dropout during simulation to account for observation noise during deployment without explicitly modelling its distribution for each state. DRL is a promising approach to control robots for highly dynamic and feedback-based manoeuvres, and accurate simulators are crucial to providing cheap and abundant data to learn the desired behaviour. Nevertheless, the simulated data are noiseless and generally show a distributional shift that challenges the deployment on real machines where sensor readings are affected by noise. The standard solution is modelling the latter and injecting it during training; while this requires a thorough system identification, Roll-Drop enhances the robustness to sensor noise by tuning only a single parameter. We demonstrate an 80% success rate when up to 25% noise is injected in the observations, with twice the robustness of the baselines. We deploy the controller trained in simulation on a Unitree A1 platform and assess this improved robustness on the physical system.
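A rough sketch of the idea of applying dropout to observations during simulated rollouts is given below. The details are assumptions for illustration only: entries are zeroed independently, no rescaling is applied to the surviving entries, and the function name `roll_drop` is hypothetical. The single tunable parameter mentioned in the abstract is represented here by `drop_prob`.

```python
import random


def roll_drop(observation, drop_prob, rng):
    """Zero each observation entry independently with probability drop_prob.

    Assumption: dropped entries are replaced by 0.0 and kept entries are
    passed through unchanged (no inverted-dropout rescaling).
    """
    return [0.0 if rng.random() < drop_prob else x for x in observation]


# Usage: corrupt the observation before handing it to the policy in simulation.
rng = random.Random(0)
clean_obs = [0.3, -1.2, 0.8, 0.05]
noisy_obs = roll_drop(clean_obs, drop_prob=0.25, rng=rng)
```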

RO · Sep 11, 2019 · Code

Crocoddyl: An Efficient and Versatile Framework for Multi-Contact Optimal Control

Carlos Mastalli, Rohan Budhiraja, Wolfgang Merkt et al.

We introduce Crocoddyl (Contact RObot COntrol by Differential DYnamic Library), an open-source framework tailored for efficient multi-contact optimal control. Crocoddyl efficiently computes the state trajectory and the control policy for a given predefined sequence of contacts. Its efficiency is due to the use of sparse analytical derivatives, exploitation of the problem structure, and data sharing. It employs differential geometry to properly describe the state of any geometrical system, e.g. floating-base systems. Additionally, we propose a novel optimal control algorithm called Feasibility-driven Differential Dynamic Programming (FDDP). Our method does not add extra decision variables, which often increase the computation time per iteration due to factorization. FDDP has a better globalization strategy than classical Differential Dynamic Programming (DDP) algorithms. Concretely, we propose two modifications to the classical DDP algorithm. First, the backward pass accepts infeasible state-control trajectories. Second, the rollout keeps the gaps open during the early "exploratory" iterations (as expected in multiple-shooting methods with only equality constraints). We showcase the performance of our framework using different tasks. With our method, we can compute highly dynamic maneuvers (e.g. jumping, front-flip) within a few milliseconds.

RO · Jun 20, 2024

Adaptive Manipulation using Behavior Trees

Jacques Cloete, Wolfgang Merkt, Ioannis Havoutis

Many manipulation tasks pose a challenge since they depend on non-visual environmental information that can only be determined after sustained physical interaction has already begun. This is particularly relevant for effort-sensitive, dynamics-dependent tasks such as tightening a valve. To perform these tasks safely and reliably, robots must be able to quickly adapt in response to unexpected changes during task execution, and should also learn from past experience to better inform future decisions. Humans can intuitively respond and adapt their manipulation strategy to suit such problems, but representing and implementing such behaviors for robots remains a challenge. In this work we show how this can be achieved within the framework of behavior trees. We present the adaptive behavior tree, a scalable and generalizable behavior tree design that enables a robot to quickly adapt to and learn from both visual and non-visual observations during task execution, preempting task failure or switching to a different manipulation strategy. The adaptive behavior tree selects the manipulation strategy that is predicted to optimize task performance, and learns from past experience to improve these predictions for future attempts. We test our approach on a variety of tasks commonly found in industry; the adaptive behavior tree demonstrates safety, robustness (100% success rate) and efficiency in task completion (up to 36% task speedup from the baseline).

RO · Jan 13, 2022

Motion Planning in Dynamic Environments Using Context-Aware Human Trajectory Prediction

Mark Nicholas Finean, Luka Petrović, Wolfgang Merkt et al.

Over the years, the separate fields of motion planning, mapping, and human trajectory prediction have advanced considerably. However, the literature is still sparse in providing practical frameworks that enable mobile manipulators to perform whole-body movements and account for the predicted motion of moving obstacles. Previous optimisation-based motion planning approaches that use distance fields have suffered from the high computational cost required to update the environment representation. We demonstrate that GPU-accelerated predicted composite distance fields significantly reduce the computation time compared to calculating distance fields from scratch. We integrate this technique with a complete motion planning and perception framework that accounts for the predicted motion of humans in dynamic environments, enabling reactive and pre-emptive motion planning that incorporates predicted motions. To achieve this, we propose and implement a novel human trajectory prediction method that combines intention recognition with trajectory optimisation-based motion planning. We validate our resultant framework on a real-world Toyota Human Support Robot (HSR) using live RGB-D sensor data from the onboard camera. In addition to providing analysis on a publicly available dataset, we release the Oxford Indoor Human Motion (Oxford-IHM) dataset and demonstrate state-of-the-art performance in human trajectory prediction. The Oxford-IHM dataset is a human trajectory prediction dataset in which people walk between regions of interest in an indoor environment. Both static and robot-mounted RGB-D cameras observe the people while tracked with a motion-capture system.

RO · Dec 9, 2021

Next Steps: Learning a Disentangled Gait Representation for Versatile Quadruped Locomotion

Alexander L. Mitchell, Wolfgang Merkt, Mathieu Geisert et al.

Quadruped locomotion is rapidly maturing to a degree where robots now routinely traverse a variety of unstructured terrains. However, while gaits can be varied typically by selecting from a range of pre-computed styles, current planners are unable to vary key gait parameters continuously while the robot is in motion. The synthesis, on-the-fly, of gaits with unexpected operational characteristics or even the blending of dynamic manoeuvres lies beyond the capabilities of the current state-of-the-art. In this work we address this limitation by learning a latent space capturing the key stance phases of a particular gait, via a generative model trained on a single trot style. This encourages disentanglement such that application of a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. In fact properties of this drive signal map directly to gait parameters such as cadence, footstep height and full stance duration. The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on a real ANYmal quadruped robot and demonstrate that our method achieves a continuous blend of dynamic trot styles whilst being robust and reactive to external perturbations.

RO · Sep 10, 2021

Where Should I Look? Optimised Gaze Control for Whole-Body Collision Avoidance in Dynamic Environments

Mark Nicholas Finean, Wolfgang Merkt, Ioannis Havoutis

As robots operate in increasingly complex and dynamic environments, fast motion re-planning has become a widely explored area of research. In a real-world deployment, we often lack the ability to fully observe the environment at all times, giving rise to the challenge of determining how to best perceive the environment given a continuously updated motion plan. We provide the first investigation into a 'smart' controller for gaze control with the objective of providing effective perception of the environment for obstacle avoidance and motion planning in dynamic and unknown environments. We detail the novel problem of determining the best head camera behaviour for mobile robots when constrained by a trajectory. Furthermore, we propose a greedy optimisation-based solution that uses a combination of voxelised rewards and motion primitives. We demonstrate that our method outperforms the benchmark methods in 2D and 3D environments, with respect to both the ability to explore the local surroundings and the success rate of finding collision-free trajectories -- our method is shown to provide 7.4x better map exploration while consistently achieving a higher success rate for generating collision-free trajectories. We verify our findings on a physical Toyota Human Support Robot (HSR) using a GPU-accelerated perception framework.

RO · Aug 4, 2021

Rapid Convex Optimization of Centroidal Dynamics using Block Coordinate Descent

Paarth Shah, Avadesh Meduri, Wolfgang Merkt et al.

In this paper we explore the use of block coordinate descent (BCD) to optimize the centroidal momentum dynamics for dynamically consistent multi-contact behaviors. The centroidal dynamics have recently received a large amount of attention in order to create physically realizable motions for robots with hands and feet while being computationally more tractable than full rigid body dynamics models. Our contribution lies in exploiting the structure of the dynamics in order to simplify the original non-convex problem into two convex subproblems. We iterate between these two subproblems for a set number of iterations or until a consensus is reached. We explore the properties of the proposed optimization method for the centroidal dynamics and verify in simulation that motions generated by our approach can be tracked by the quadruped Solo12. In addition, we compare our method to a recently proposed convexification using a sequence of convex relaxations as well as a more standard interior point method used in the off-the-shelf solver IPOPT to show that our approach finds similar, if not better, trajectories (in terms of cost), and is more than four times faster than both approaches. Finally, compared to previous approaches, we note its practicality due to the convex nature of each subproblem which allows our method to be used with any off-the-shelf quadratic programming solver.

RO · Jun 20, 2021

HapFIC: An Adaptive Force/Position Controller for Safe Environment Interaction in Articulated Systems

Carlo Tiseo, Wolfgang Merkt, Keyhan Kouhkiloui Babarahmati et al.

Haptic interaction is essential for the dynamic dexterity of animals, which seamlessly switch from an impedance to an admittance behaviour using the force feedback from their proprioception. However, this ability is extremely challenging to reproduce in robots, especially when dealing with complex interaction dynamics, distributed contacts, and contact switching. Current model-based controllers require accurate interaction modelling to account for contacts and stabilise the interaction. In this manuscript, we propose an adaptive force/position controller that exploits the fractal impedance controller's passivity and non-linearity to execute a finite search algorithm using the force feedback signal from the sensor at the end-effector. The method is computationally inexpensive, opening the possibility to deal with distributed contacts in the future. We evaluated the architecture in physics simulation and showed that the controller can robustly control the interaction with objects of different dynamics without violating the maximum allowable target forces or causing numerical instability even for very rigid objects. The proposed controller can also autonomously deal with contact switching and may find application in multiple fields such as legged locomotion, rehabilitation and assistive robotics.

RO · Mar 5, 2021

Simultaneous Scene Reconstruction and Whole-Body Motion Planning for Safe Operation in Dynamic Environments

Mark Nicholas Finean, Wolfgang Merkt, Ioannis Havoutis

Recent work has demonstrated real-time mapping and reconstruction from dense perception, while motion planning based on distance fields has been shown to achieve fast, collision-free motion synthesis with good convergence properties. However, demonstration of a fully integrated system that can safely re-plan in unknown environments, in the presence of static and dynamic obstacles, has remained an open challenge. In this work, we first study the impact that signed and unsigned distance fields have on optimisation convergence, and the resultant error cost in trajectory optimisation problems in 2D path planning, arm manipulator motion planning, and whole-body loco-manipulation planning. We further analyse the performance of three state-of-the-art approaches to generating distance fields (Voxblox, Fiesta, and GPU-Voxels) for use in real-time environment reconstruction. Finally, we use our findings to construct a practical hybrid mapping and motion planning system which uses GPU-Voxels and GPMP2 to perform receding-horizon whole-body motion planning that can smoothly avoid moving obstacles in 3D space using live sensor data. Our results are validated in simulation and on a real-world Toyota Human Support Robot (HSR).

RO · Feb 25, 2021

CPG-ACTOR: Reinforcement Learning for Central Pattern Generators

Luigi Campanaro, Siddhant Gangapurwala, Daniele De Martini et al.

Central Pattern Generators (CPGs) have several properties desirable for locomotion: they generate smooth trajectories, are robust to perturbations and are simple to implement. Although conceptually promising, we argue that the full potential of CPGs has so far been limited by insufficient sensory-feedback information. This paper proposes a new methodology that allows tuning CPG controllers through gradient-based optimization in a Reinforcement Learning (RL) setting. To the best of our knowledge, this is the first time CPGs have been trained in conjunction with a Multilayer Perceptron (MLP) network in a Deep-RL context. In particular, we show how CPGs can directly be integrated as the Actor in an Actor-Critic formulation. Additionally, we demonstrate how this change permits us to integrate highly non-linear feedback directly from sensory perception to reshape the oscillators' dynamics. Our results on a locomotion task using a single-leg hopper demonstrate that explicitly using the CPG as the Actor rather than as part of the environment results in a significant increase in the reward gained over time (6x more) compared with previous approaches. Furthermore, we show that our method without feedback reproduces results similar to prior work with feedback. Finally, we demonstrate how our closed-loop CPG progressively improves the hopping behaviour for longer training epochs relying only on basic reward functions.

RO · Nov 14, 2020

Sparsity-Inducing Optimal Control via Differential Dynamic Programming

Traiko Dinev, Wolfgang Merkt, Vladimir Ivan et al.

Optimal control is a popular approach to synthesize highly dynamic motion. Commonly, $L_2$ regularization is used on the control inputs in order to minimize energy used and to ensure smoothness of the control inputs. However, for some systems, such as satellites, the control needs to be applied in sparse bursts due to how the propulsion system operates. In this paper, we study approaches to induce sparsity in optimal control solutions -- namely via smooth $L_1$ and Huber regularization penalties. We apply these loss terms to state-of-the-art DDP-based solvers to create a family of sparsity-inducing optimal control methods. We analyze and compare the effect of the different losses on inducing sparsity, their numerical conditioning, their impact on convergence, and discuss hyperparameter settings. We demonstrate our method in simulation and hardware experiments on canonical dynamics systems, control of satellites, and the NASA Valkyrie humanoid robot. We provide an implementation of our method and all examples for reproducibility on GitHub.
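To make the contrast between the regularizers in this abstract concrete, the sketch below evaluates three scalar control penalties. The Huber form is the standard one; the smoothed $L_1$ form shown (a square root with a small `eps`) is only one common choice and is an assumption here, as are the function names and default parameters. The paper's exact definitions may differ.

```python
import math


def l2_penalty(u):
    """Quadratic penalty: cheap near zero, so small non-zero controls persist."""
    return 0.5 * u * u


def huber_penalty(u, delta=1.0):
    """Standard Huber loss: quadratic for |u| <= delta, linear beyond it."""
    a = abs(u)
    if a <= delta:
        return 0.5 * u * u
    return delta * (a - 0.5 * delta)


def smooth_l1_penalty(u, eps=1e-3):
    """Differentiable approximation of |u| (assumed form: sqrt(u^2 + eps^2) - eps)."""
    return math.sqrt(u * u + eps * eps) - eps


# Usage: compare how each penalty charges a small and a large control input.
for u in (0.01, 2.0):
    print(u, l2_penalty(u), huber_penalty(u), smooth_l1_penalty(u))
```

The linear growth of the Huber and smoothed $L_1$ penalties for large inputs, together with their comparatively higher cost for small inputs, is what favours solutions with few, concentrated bursts of control over many small ones.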

RO · Nov 1, 2020

A Passive Navigation Planning Algorithm for Collision-free Control of Mobile Robots

Carlo Tiseo, Vladimir Ivan, Wolfgang Merkt et al.

Path planning and collision avoidance are challenging in complex and highly variable environments due to the limited horizon of events. In the literature, there are multiple model- and learning-based approaches that require significant computational resources to be effectively deployed and they may have limited generality. We propose a planning algorithm based on a globally stable passive controller that can plan smooth trajectories using limited computational resources in challenging environmental conditions. The architecture combines the recently proposed fractal impedance controller with elastic bands and regions of finite time invariance. As the method is based on an impedance controller, it can also be used directly as a force/torque controller. We validated our method in simulation to analyse the ability of interactive navigation in challenging concave domains via the issuing of via-points, and its robustness to low bandwidth feedback. A swarm simulation using 11 agents validated the scalability of the proposed method. We have performed hardware experiments on a holonomic wheeled platform validating smoothness and robustness of interaction with dynamic agents (i.e., humans and robots). The computational complexity of the proposed local planner enables deployment with low-power micro-controllers, lowering the energy consumption compared to other methods that rely upon numeric optimisation.

RO · Oct 11, 2020

Inverse Dynamics vs. Forward Dynamics in Direct Transcription Formulations for Trajectory Optimization

Henrique Ferrolho, Vladimir Ivan, Wolfgang Merkt et al.

Benchmarks of state-of-the-art rigid-body dynamics libraries report better performance solving the inverse dynamics problem than the forward alternative. Those benchmarks encouraged us to question whether that computational advantage would translate to direct transcription, where calculating rigid-body dynamics and their derivatives accounts for a significant share of computation time. In this work, we implement an optimization framework where both approaches for enforcing the system dynamics are available. We evaluate the performance of each approach for systems of varying complexity, for domains with rigid contacts. Our tests reveal that formulations using inverse dynamics converge faster, require fewer iterations, and are more robust to coarse problem discretization. These results indicate that inverse dynamics should be preferred to enforce the nonlinear system dynamics in simultaneous methods, such as direct transcription.

RO · Oct 2, 2020

Memory Clustering using Persistent Homology for Multimodality- and Discontinuity-Sensitive Learning of Optimal Control Warm-starts

Wolfgang Merkt, Vladimir Ivan, Traiko Dinev et al.

Shooting methods are an efficient approach to solving nonlinear optimal control problems. As they use local optimization, they exhibit favorable convergence when initialized with a good warm-start but may not converge at all if provided with a poor initial guess. Recent work has focused on providing an initial guess from a learned model trained on samples generated during an offline exploration of the problem space. However, in practice the solutions contain discontinuities introduced by system dynamics or the environment. Additionally, in many cases multiple equally suitable, i.e., multi-modal, solutions exist to solve a problem. Classic learning approaches smooth across the boundary of these discontinuities and thus generalize poorly. In this work, we apply tools from algebraic topology to extract information on the underlying structure of the solution space. In particular, we introduce a method based on persistent homology to automatically cluster the dataset of precomputed solutions to obtain different candidate initial guesses. We then train a Mixture-of-Experts within each cluster to predict state and control trajectories to warm-start the optimal control solver and provide a comparison with modality-agnostic learning. We demonstrate our method on a cart-pole toy problem and a quadrotor avoiding obstacles, and show that clustering samples based on inherent structure improves the warm-start quality.

RO · Oct 1, 2020

A Feasibility-Driven Approach to Control-Limited DDP

Carlos Mastalli, Wolfgang Merkt, Josep Marti-Saumell et al.

Differential dynamic programming (DDP) is a direct single shooting method for trajectory optimization. Its efficiency derives from the exploitation of temporal structure (inherent to optimal control problems) and explicit roll-out/integration of the system dynamics. However, it suffers from numerical instability and, when compared to direct multiple shooting methods, it has limited initialization options (allows initialization of controls, but not of states) and lacks proper handling of control constraints. In this work, we tackle these issues with a feasibility-driven approach that regulates the dynamic feasibility during the numerical optimization and ensures control limits. Our feasibility search emulates the numerical resolution of a direct multiple shooting problem with only dynamics constraints. We show that our approach (named BOX-FDDP) has better numerical convergence than BOX-DDP+ (a single shooting method), and that its convergence rate and runtime performance are competitive with state-of-the-art direct transcription formulations solved using the interior point and active set algorithms available in KNITRO. We further show that BOX-FDDP decreases the dynamic feasibility error monotonically, as in state-of-the-art nonlinear programming algorithms. We demonstrate the benefits of our approach by generating complex and athletic motions for quadruped and humanoid robots. Finally, we highlight that BOX-FDDP is suitable for model predictive control in legged robots.

RO · Aug 3, 2020

Predicted Composite Signed-Distance Fields for Real-Time Motion Planning in Dynamic Environments

Mark Nicholas Finean, Wolfgang Merkt, Ioannis Havoutis

We present a novel framework for motion planning in dynamic environments that accounts for the predicted trajectories of moving objects in the scene. We explore the use of composite signed-distance fields in motion planning and detail how they can be used to generate signed-distance fields (SDFs) in real-time to incorporate predicted obstacle motions. We benchmark our approach of using composite SDFs against performing exact SDF calculations on the workspace occupancy grid. Our proposed technique generates predictions substantially faster and typically exhibits an 81-97% reduction in time for subsequent predictions. We integrate our framework with GPMP2 to demonstrate a full implementation of our approach in real-time, enabling a 7-DoF Panda arm to smoothly avoid a moving robot.
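One way to picture a composite SDF is as the pointwise minimum of a static-scene SDF and the SDFs of moving obstacles placed at their predicted positions. The sketch below assumes that composition rule, a 2D workspace, and circular obstacles; the function names are hypothetical and none of this is taken from the paper's implementation.

```python
import math


def circle_sdf(px, py, cx, cy, radius):
    """Signed distance from point (px, py) to a circle: negative inside."""
    return math.hypot(px - cx, py - cy) - radius


def composite_sdf(px, py, static_sdf, obstacle_radius, predicted_centres):
    """Combine a static SDF with moving-obstacle SDFs by taking the minimum.

    Assumption: each moving obstacle is a circle of the same radius, and
    `predicted_centres` holds its predicted (x, y) positions at one time step.
    """
    distance = static_sdf(px, py)
    for cx, cy in predicted_centres:
        distance = min(distance, circle_sdf(px, py, cx, cy, obstacle_radius))
    return distance


# Usage: a ground plane at y = 0 plus one obstacle predicted at (0, 2).
def ground(x, y):
    return y


clearance = composite_sdf(0.0, 5.0, ground, 1.0, [(0.0, 2.0)])
```

Because the static field is reused and only the cheap per-obstacle terms change between time steps, a composition of this kind avoids recomputing a full distance field for every predicted configuration, which is the motivation the abstract gives.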

RO · Mar 3, 2020

Bio-mimetic Adaptive Force/Position Control Using Fractal Impedance

Carlo Tiseo, Wolfgang Merkt, Keyhan Kouhkiloui Babarahmati et al.

The ability of animals to interact with complex dynamics is unmatched in robots. Especially important to interaction performance is the online adaptation of body dynamics, which can be modeled as an impedance behaviour. However, variable impedance control still poses a challenge in current control frameworks due to the difficulty of retaining stability when adapting the controller gains. The fractal impedance controller has been recently proposed to solve this issue. However, it still has limitations such as sudden jumps in force when it starts to converge to the desired position and the lack of a force feedback loop. In this manuscript, two improvements are made to the control framework to solve these limitations. The force discontinuity has been addressed by introducing a modulation of the impedance via a virtual antagonist that modulates the output force. The force tracking has been modeled after the parallel force/position controller architecture. In contrast to traditional methods, the fractal impedance controller enables the implementation of a search algorithm on the force feedback to adapt its behaviour to the external environment instead of relying on a priori knowledge of the external dynamics. Preliminary simulation results presented in this paper show the feasibility of the proposed approach and allow evaluation of the trade-off that needs to be made when relying on the proposed controller for interaction. In conclusion, the proposed method mimics the behaviour of an agonist/antagonist system adapting to unknown external dynamics, and it may find application in computational neuroscience, haptics, and interaction control.

RO · Mar 3, 2020

Modeling and Control of a Hybrid Wheeled Jumping Robot

Traiko Dinev, Songyan Xin, Wolfgang Merkt et al.

In this paper, we study a wheeled robot with a prismatic extension joint. This allows the robot to build up momentum to perform jumps over obstacles and to swing up to the upright position after the loss of balance. We propose a template model for the class of such two-wheeled jumping robots. This model can be considered as the simplest wheeled-legged system. We provide an analytical derivation of the system dynamics which we use inside a model predictive controller (MPC). We study the behavior of the model and demonstrate highly dynamic motions such as swing-up and jumping. Furthermore, these motions are discovered through optimization from first principles. We evaluate the controller on a variety of tasks and uneven terrains in a simulator.

ROMar 1, 2020

Optimizing Dynamic Trajectories for Robustness to Disturbances Using Polytopic Projections

Henrique Ferrolho, Wolfgang Merkt, Vladimir Ivan et al.

This paper focuses on robustness to disturbance forces and uncertain payloads. We present a novel formulation to optimize the robustness of dynamic trajectories. A straightforward transcription of this formulation into a nonlinear programming problem is not tractable for state-of-the-art solvers, but it is possible to overcome this complication by exploiting the structure induced by the kinematics of the robot. The non-trivial transcription proposed allows trajectory optimization frameworks to converge to highly robust dynamic solutions. We demonstrate the results of our approach using a quadruped robot equipped with a manipulator.

ROFeb 27, 2020

Safe and Compliant Control of Redundant Robots Using Superimposition of Passive Task-Space Controllers

Carlo Tiseo, Wolfgang Merkt, Wouter Wolfslag et al.

Safe and compliant control of dynamic systems in interaction with the environment, e.g., in shared workspaces, continues to represent a major challenge. Mismatches in the dynamic model of the robots, numerical singularities, and the intrinsic environmental unpredictability are all contributing factors. Online optimization of impedance controllers has recently shown great promise in addressing this challenge; however, its performance is not sufficiently robust for deployment in challenging environments. This work proposes a compliant control method for redundant manipulators based on a superimposition of multiple passive task-space controllers in a hierarchy. Our control framework of passive controllers is inherently stable, numerically well-conditioned (as no matrix inversions are required), and computationally inexpensive (as no optimization is used). We introduce a novel stiffness profile for a recently proposed passive controller, with smooth transitions between the divergence and convergence phases, making it particularly suitable when multiple passive controllers are combined through superimposition. Our experimental results demonstrate that the proposed method achieves sub-centimeter tracking performance during demanding dynamic tasks with fast-changing references, while remaining safe to interact with and robust to singularities. The proposed framework achieves such results without knowledge of the robot dynamics and, thanks to its passivity, is intrinsically stable. The data further show that the robot can take full advantage of the redundancy to maintain the primary task accuracy while compensating for unknown environmental interactions, which is not possible with current frameworks that require accurate contact information.

ROFeb 7, 2020

Learning Whole-body Motor Skills for Humanoids

Chuanyu Yang, Kai Yuan, Wolfgang Merkt et al.

This paper presents a hierarchical framework for Deep Reinforcement Learning that acquires motor skills for a variety of push recovery and balancing behaviors, i.e., ankle, hip, foot tilting, and stepping strategies. The policy is trained in a physics simulator with a realistic robot model and low-level impedance control, which makes it easy to transfer the learned skills to real robots. The advantage over traditional methods is the integration of the high-level planner and feedback control in a single coherent policy network, which is generic for learning versatile balancing and recovery motions against unknown perturbations at arbitrary locations (e.g., legs, torso). Furthermore, the proposed framework allows the policy to be learned quickly by many state-of-the-art learning algorithms. Compared with studies of preprogrammed, special-purpose controllers in the literature, the self-learned skills are comparable in terms of disturbance rejection, with the additional advantage of producing a wide range of adaptive, versatile and robust behaviors.

ROAug 15, 2019

Residual Force Polytope: Admissible Task-Space Forces of Dynamic Trajectories

Henrique Ferrolho, Wolfgang Merkt, Carlo Tiseo et al.

We propose a representation for the set of forces a robot can counteract using full system dynamics: the residual force polytope. Given the nominal torques required by a dynamic motion, this representation models the forces which can be sustained without interfering with that motion. The residual force polytope can be used to analyze and compare the set of admissible forces of different trajectories, but it can also be used to define metrics for solving optimization problems, such as in trajectory optimization or system design. We demonstrate how such a metric can be applied to trajectory optimization and compare it against other objective functions typically used. Our results show that the trajectories computed by optimizing objectives defined as functions of the residual force polytope are more robust to unknown external disturbances. The computational cost of these metrics is relatively high and not compatible with the short planning times required by online methods, but they are acceptable for planning motions offline.
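As a rough illustration of the idea in this abstract, the sketch below computes the vertices of a residual force set for a toy case: a planar arm with two joints and a square, invertible 2x2 Jacobian. It is not the authors' implementation. The function name, the symmetric torque limits, and the sign convention (an external force f adds J^T f to the nominal joint torques) are all assumptions made for this example. For redundant or floating-base robots the Jacobian is not square, and the set would have to be obtained by polytope projection instead.

```python
from itertools import product


def residual_force_vertices(jacobian, tau_nominal, tau_limit):
    """Vertices of {f : |tau_nominal + J^T f| <= tau_limit} for a square 2x2 J.

    Hypothetical helper for illustration only. `jacobian` is ((a, b), (c, d)),
    `tau_nominal` and `tau_limit` are length-2 sequences (limits are symmetric).
    """
    (a, b), (c, d) = jacobian
    det = a * d - b * c  # det(J^T) equals det(J)
    if abs(det) < 1e-12:
        raise ValueError("Jacobian is singular; the force set is unbounded")

    # Inverse of J^T = [[a, c], [b, d]]
    inv_jt = [[d / det, -c / det], [-b / det, a / det]]

    vertices = []
    # Each corner of the residual torque box maps to one vertex of the force set.
    for signs in product((-1.0, 1.0), repeat=2):
        residual = [s * lim - tau
                    for s, lim, tau in zip(signs, tau_limit, tau_nominal)]
        fx = inv_jt[0][0] * residual[0] + inv_jt[0][1] * residual[1]
        fy = inv_jt[1][0] * residual[0] + inv_jt[1][1] * residual[1]
        vertices.append((fx, fy))
    return vertices


if __name__ == "__main__":
    # Identity Jacobian: the force set is simply the shifted torque box.
    print(residual_force_vertices(((1, 0), (0, 1)), (1.0, 0.0), (2.0, 2.0)))
```

In this toy case the nominal torque on the first joint uses up part of the torque budget, so the force set is shifted rather than centred on zero. This is the kind of asymmetry a metric built on the polytope could capture.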

ROMay 11, 2019

Comparing Alternate Modes of Teleoperation for Constrained Tasks

Christopher E. Mower, Wolfgang Merkt, Aled Davies et al.

Teleoperation of heavy machinery in industry often requires operators to be in close proximity to the plant and issue commands on a per-actuator level using joystick input devices. However, this is non-intuitive and makes achieving desired job properties a challenging task requiring operators to complete extensive and costly training. Despite this, operator fatigue is common with implications for personal safety, project timeliness, cost, and quality. While full automation is not yet achievable due to unpredictability and the dynamic nature of the environment and task, shared control paradigms allow operators to issue high-level commands in an intuitive, task-informed control space while having the robot optimize for achieving desired job properties. In this paper, we compare a number of modes of teleoperation, exploring both the number of dimensions of the control input as well as the most intuitive control spaces. Our experimental evaluations of the performance metrics were based on quantifying the difficulty of tasks based on the well known Fitts' law as well as a measure of how well constraints affecting the task performance were met. Our experiments show that higher performance is achieved when humans submit commands in low-dimensional task spaces as opposed to joint space manipulations.
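For readers unfamiliar with Fitts' law, the sketch below shows how task difficulty is commonly quantified from the distance to a target and the target's width. It uses the widely cited Shannon formulation, which is an assumption here: the abstract does not say which variant the authors used. The function names and the regression coefficients `a` and `b` are hypothetical.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts' index of difficulty in bits (Shannon formulation, assumed here)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be non-negative and width positive")
    return math.log2(distance / width + 1.0)


def predicted_movement_time(distance: float, width: float,
                            a: float, b: float) -> float:
    """Movement time MT = a + b * ID, with a and b fitted per input mode."""
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    # A target 3 units away and 1 unit wide has ID = log2(4) = 2 bits.
    print(index_of_difficulty(3.0, 1.0))
    print(predicted_movement_time(3.0, 1.0, a=0.1, b=0.2))
```

Under this model, a control mode with a smaller fitted slope `b` completes equally difficult tasks faster, which is one way different teleoperation modes could be compared.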

ROJul 25, 2016

Scaling Sampling-based Motion Planning to Humanoid Robots

Yiming Yang, Vladimir Ivan, Wolfgang Merkt et al.

Planning balanced and collision-free motion for humanoid robots is non-trivial, especially when they are operated in complex environments, such as reaching targets behind obstacles or through narrow passages. We propose a method that allows us to apply existing sampling-based algorithms to plan trajectories for humanoids by utilizing a customized state space representation, biased sampling strategies, and a steering function based on a robust inverse kinematics solver. Our approach requires no prior offline computation, so the work can easily be transferred to new robot platforms. We tested the proposed method on practical reaching tasks with a 38-degree-of-freedom humanoid robot, NASA Valkyrie, showing that our method is able to generate valid motion plans that can be executed on advanced full-size humanoid robots. We also present a benchmark of different motion planning algorithms evaluated on a variety of reaching motion problems. This allows us to find suitable algorithms for solving humanoid motion planning problems, and to identify the limitations of these algorithms.