CV · Apr 7, 2022 · Code
Zero-Shot Category-Level Object Pose Estimation
Walter Goodwin, Sagar Vaze, Ioannis Havoutis et al.
Object pose estimation is an important component of most vision pipelines for embodied agents, as well as in 3D vision more generally. In this paper we tackle the problem of estimating the pose of novel object categories in a zero-shot manner. This extends much of the existing literature by removing the need for pose-labelled datasets or category-specific CAD models for training or inference. Specifically, we make the following contributions. First, we formalise the zero-shot, category-level pose estimation problem and frame it in a way that is most applicable to real-world embodied agents. Secondly, we propose a novel method based on semantic correspondences from a self-supervised vision transformer to solve the pose estimation problem. We further re-purpose the recent CO3D dataset to present a controlled and realistic test setting. Finally, we demonstrate that all baselines for our proposed task perform poorly, and show that our method provides a six-fold improvement in average rotation accuracy at 30 degrees. Our code is available at https://github.com/applied-ai-lab/zero-shot-pose.
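The "rotation accuracy at 30 degrees" figure quoted above can be illustrated with a short sketch. It assumes the usual geodesic angle between predicted and ground-truth rotation matrices and counts a prediction as correct when that angle is at most the threshold; the function names and the inclusive threshold are illustrative assumptions, not taken from the paper's code.

```python
import math


def rotation_angle_deg(r_pred, r_true):
    """Geodesic angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(r_pred^T r_true) equals the sum of elementwise products.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def accuracy_at(threshold_deg, preds, truths):
    """Fraction of predictions whose rotation error is within threshold_deg."""
    errors = [rotation_angle_deg(p, t) for p, t in zip(preds, truths)]
    return sum(e <= threshold_deg for e in errors) / len(errors)


def rot_z(deg):
    """Helper for the demo: rotation about the z axis by deg degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    truths = [rot_z(0), rot_z(0)]
    preds = [rot_z(10), rot_z(50)]  # one within 30 degrees, one outside
    print(accuracy_at(30.0, preds, truths))
```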
RO · Oct 21, 2022
Reaching Through Latent Space: From Joint Statistics to Path Planning in Manipulation
Chia-Man Hung, Shaohong Zhong, Walter Goodwin et al. · deepmind, oxford
We present a novel approach to path planning for robotic manipulators, in which paths are produced via iterative optimisation in the latent space of a generative model of robot poses. Constraints are incorporated through the use of constraint satisfaction classifiers operating on the same space. Optimisation leverages gradients through our learned models that provide a simple way to combine goal reaching objectives with constraint satisfaction, even in the presence of otherwise non-differentiable constraints. Our models are trained in a task-agnostic manner on randomly sampled robot poses. In baseline comparisons against a number of widely used planners, we achieve commensurate performance in terms of task success, planning time and path length, performing successful path planning with obstacle avoidance on a real 7-DoF robot arm.
RO · May 2, 2022
VAE-Loco: Versatile Quadruped Locomotion by Learning a Disentangled Gait Representation
Alexander L. Mitchell, Wolfgang Merkt, Mathieu Geisert et al. · deepmind
Quadruped locomotion is rapidly maturing to a degree where robots are able to realise highly dynamic manoeuvres. However, current planners are unable to vary key gait parameters of the in-swing feet midair. In this work we address this limitation and show that it is pivotal in increasing controller robustness by learning a latent space capturing the key stance phases constituting a particular gait. This is achieved via a generative model trained on a single trot style, which encourages disentanglement such that application of a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. We demonstrate that specific properties of the drive signal map directly to gait parameters such as cadence, footstep height and full stance duration. Due to the nature of our approach these synthesised gaits are continuously variable online during robot operation. The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on two versions of the real ANYmal quadruped robots and demonstrate that our method achieves a continuous blend of dynamic trot styles whilst being robust and reactive to external perturbations.
RO · Mar 6, 2023
Leveraging Scene Embeddings for Gradient-Based Motion Planning in Latent Space
Jun Yamada, Chia-Man Hung, Jack Collins et al. · oxford
Motion planning framed as optimisation in structured latent spaces has recently emerged as competitive with traditional methods in terms of planning success while significantly outperforming them in terms of computational speed. However, the real-world applicability of recent work in this domain remains limited by the need to express obstacle information directly in state-space, involving simple geometric primitives. In this work we address this challenge by leveraging learned scene embeddings together with a generative model of the robot manipulator to drive the optimisation process. In addition, we introduce an approach for efficient collision checking which directly regularises the optimisation undertaken for planning. Using simulated as well as real-world experiments, we demonstrate that our approach, AMP-LS, is able to successfully plan in novel, complex scenes while outperforming traditional planning baselines in terms of computation speed by an order of magnitude. We show that the resulting system is fast enough to enable closed-loop planning in real-world dynamic scenes.
RO · Mar 14, 2022
Agile Maneuvers in Legged Robots: a Predictive Control Approach
Carlos Mastalli, Wolfgang Merkt, Guiyang Xin et al.
Planning and execution of agile locomotion maneuvers have been a longstanding challenge in legged robotics. It requires deriving motion plans and local feedback policies in real-time to handle the nonholonomy of the kinetic momenta. To achieve this, we propose a hybrid predictive controller that considers the robot's actuation limits and full-body dynamics. It combines the feedback policies with tactile information to locally predict future actions. It converges within a few milliseconds thanks to a feasibility-driven approach. Our predictive controller enables ANYmal robots to generate agile maneuvers in realistic scenarios. A crucial element is to track the local feedback policies as, in contrast to whole-body control, they achieve the desired angular momentum. To the best of our knowledge, our predictive controller is the first to handle actuation limits, generate agile locomotion maneuvers, and execute optimal feedback policies for low level torque control without the use of a separate whole-body controller.
RO · Feb 17 · Code
ODYN: An All-Shifted Non-Interior-Point Method for Quadratic Programming in Robotics and AI
Jose Rojas, Aristotelis Papatheodorou, Sergi Martinez et al.
We introduce ODYN, a novel all-shifted primal-dual non-interior-point quadratic programming (QP) solver designed to efficiently handle challenging dense and sparse QPs. ODYN combines all-shifted nonlinear complementarity problem (NCP) functions with proximal method of multipliers to robustly address ill-conditioned and degenerate problems, without requiring linear independence of the constraints. It exhibits strong warm-start performance and is well suited to both general-purpose optimization, and robotics and AI applications, including model-based control, estimation, and kernel-based learning methods. We provide an open-source implementation and benchmark ODYN on the Maros-Mészáros test set, demonstrating state-of-the-art convergence performance in small-to-high-scale problems. The results highlight ODYN's superior warm-starting capabilities, which are critical in sequential and real-time settings common in robotics and AI. These advantages are further demonstrated by deploying ODYN as the backend of an SQP-based predictive control framework (OdynSQP), as the implicitly differentiable optimization layer for deep learning (ODYNLayer), and the optimizer of a contact-dynamics simulation (ODYNSim).
RO · Sep 26, 2022
Learning and Deploying Robust Locomotion Policies with Minimal Dynamics Randomization
Luigi Campanaro, Siddhant Gangapurwala, Wolfgang Merkt et al.
Training deep reinforcement learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behaviour. In this regard, simulators provide a cheap and abundant source. For successful sim-to-real transfer, exhaustively engineered approaches such as system identification, dynamics randomization, and domain adaptation are generally employed. As an alternative, we investigate a simple strategy of random force injection (RFI) to perturb system dynamics during training. We show that the application of random forces enables us to emulate dynamics randomization. This allows us to obtain locomotion policies that are robust to variations in system dynamics. We further extend RFI, referred to as extended random force injection (ERFI), by introducing an episodic actuation offset. We demonstrate that ERFI provides additional robustness for variations in system mass offering on average a 53% improved performance over RFI. We also show that ERFI is sufficient to perform a successful sim-to-real transfer on two different quadrupedal platforms, ANYmal C and Unitree A1, even for perceptive locomotion over uneven terrain in outdoor environments.
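The two perturbations described above, a fresh random force at every step (RFI) plus an offset resampled once per episode (ERFI), might be sketched as follows. This is only an illustration under stated assumptions: the perturbation is assumed to be added to commanded joint torques, the distributions are assumed uniform, and the class name, parameters and limits are hypothetical rather than taken from the paper.

```python
import random


class ExtendedRandomForceInjection:
    """Toy ERFI-style perturbation: per-step noise plus a per-episode offset."""

    def __init__(self, num_joints, step_limit, offset_limit, seed=None):
        self.num_joints = num_joints
        self.step_limit = step_limit      # bound on per-step noise (hypothetical)
        self.offset_limit = offset_limit  # bound on episodic offset (hypothetical)
        self.rng = random.Random(seed)
        self.offset = [0.0] * num_joints

    def reset(self):
        """Call at the start of each episode: resample the held offset."""
        self.offset = [
            self.rng.uniform(-self.offset_limit, self.offset_limit)
            for _ in range(self.num_joints)
        ]

    def perturb(self, torques):
        """Call at every control step: add fresh noise and the episodic offset."""
        return [
            t + self.rng.uniform(-self.step_limit, self.step_limit) + o
            for t, o in zip(torques, self.offset)
        ]


if __name__ == "__main__":
    erfi = ExtendedRandomForceInjection(num_joints=3, step_limit=0.5,
                                        offset_limit=2.0, seed=0)
    erfi.reset()
    print(erfi.perturb([1.0, 1.0, 1.0]))
```

Setting `offset_limit` to zero reduces the sketch to plain per-step random force injection.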
RO · Sep 29, 2022
Learning Low-Frequency Motion Control for Robust and Dynamic Robot Locomotion
Siddhant Gangapurwala, Luigi Campanaro, Ioannis Havoutis
Robotic locomotion is often approached with the goal of maximizing robustness and reactivity by increasing motion control frequency. We challenge this intuitive notion by demonstrating robust and dynamic locomotion with a learned motion controller executing at as low as 8 Hz on a real ANYmal C quadruped. The robot is able to robustly and repeatably achieve a high heading velocity of 1.5 m/s, traverse uneven terrain, and resist unexpected external perturbations. We further present a comparative analysis of deep reinforcement learning (RL) based motion control policies trained and executed at frequencies ranging from 5 Hz to 200 Hz. We show that low-frequency policies are less sensitive to actuation latencies and variations in system dynamics. This is to the extent that a successful sim-to-real transfer can be performed even without any dynamics randomization or actuation modeling. We support this claim through a set of rigorous empirical evaluations. Moreover, to assist reproducibility, we provide the training and deployment code along with an extended analysis at https://ori-drs.github.io/lfmc/.
RO · Apr 25, 2023
Roll-Drop: accounting for observation noise with a single parameter
Luigi Campanaro, Daniele De Martini, Siddhant Gangapurwala et al.
This paper proposes a simple strategy for sim-to-real in deep reinforcement learning (DRL) -- called Roll-Drop -- that uses dropout during simulation to account for observation noise during deployment without explicitly modelling its distribution for each state. DRL is a promising approach to control robots for highly dynamic and feedback-based manoeuvres, and accurate simulators are crucial to providing cheap and abundant data to learn the desired behaviour. Nevertheless, the simulated data are noiseless and generally show a distributional shift that challenges the deployment on real machines where sensor readings are affected by noise. The standard solution is modelling the latter and injecting it during training; while this requires a thorough system identification, Roll-Drop enhances the robustness to sensor noise by tuning only a single parameter. We demonstrate an 80% success rate when up to 25% noise is injected in the observations, with twice the robustness of the baselines. We deploy the controller trained in simulation on a Unitree A1 platform and assess this improved robustness on the physical system.
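A minimal sketch of the idea of using dropout during simulation, with the drop probability as the single tunable parameter. The abstract does not say where in the pipeline dropout is applied, so this sketch assumes it acts directly on the observation vector, zeroing entries without rescaling; the function name `roll_drop` and its signature are hypothetical.

```python
import random


def roll_drop(observation, drop_prob, rng=random):
    """Zero each observation entry independently with probability drop_prob.

    Assumption: dropped entries are set to 0.0 and kept entries are not
    rescaled. drop_prob is the one parameter to tune.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    return [0.0 if rng.random() < drop_prob else x for x in observation]


if __name__ == "__main__":
    rng = random.Random(0)
    obs = [0.3, -1.2, 0.8, 2.0]  # made-up observation values
    # During simulated rollouts the policy would see the corrupted observation.
    print(roll_drop(obs, drop_prob=0.25, rng=rng))
```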
LG · Jan 30
PlatoLTL: Learning to Generalize Across Symbols in LTL Instructions for Multi-Task RL
Jacques Cloete, Mathias Jackermeier, Ioannis Havoutis et al.
A central challenge in multi-task reinforcement learning (RL) is to train generalist policies capable of performing tasks not seen during training. To facilitate such generalization, linear temporal logic (LTL) has recently emerged as a powerful formalism for specifying structured, temporally extended tasks to RL agents. While existing approaches to LTL-guided multi-task RL demonstrate successful generalization across LTL specifications, they are unable to generalize to unseen vocabularies of propositions (or "symbols"), which describe high-level events in LTL. We present PlatoLTL, a novel approach that enables policies to zero-shot generalize not only compositionally across LTL formula structures, but also parametrically across propositions. We achieve this by treating propositions as instances of parameterized predicates rather than discrete symbols, allowing policies to learn shared structure across related propositions. We propose a novel architecture that embeds and composes predicates to represent LTL specifications, and demonstrate successful zero-shot generalization to novel propositions and tasks across challenging environments.
RO · May 14, 2025
Neural Associative Skill Memories for safer robotics and modelling human sensorimotor repertoires
Pranav Mahajan, Mufeng Tang, T. Ed Li et al.
Modern robots face challenges shared by humans, where machines must learn multiple sensorimotor skills and express them adaptively. Equipping robots with a human-like memory of how it feels to do multiple stereotypical movements can make robots more aware of normal operational states and help develop self-preserving safer robots. Associative Skill Memories (ASMs) aim to address this by linking movement primitives to sensory feedback, but existing implementations rely on hard-coded libraries of individual skills. A key unresolved problem is how a single neural network can learn a repertoire of skills while enabling fault detection and context-aware execution. Here we introduce Neural Associative Skill Memories (Neural ASMs), a framework that utilises self-supervised predictive coding for temporal prediction to unify skill learning and expression, using biologically plausible learning rules. Unlike traditional ASMs which require explicit skill selection, Neural ASMs implicitly recognize and express skills through contextual inference, enabling fault detection across learned behaviours without an explicit skill selection mechanism. Compared to recurrent neural networks trained via backpropagation through time, our model achieves comparable qualitative performance in skill memory expression while using local learning rules and predicts a biologically relevant speed-accuracy trade-off during skill memory expression. This work advances the field of neurorobotics by demonstrating how predictive coding principles can model adaptive robot control and human motor preparation. By unifying fault detection, reactive control, skill memorisation and expression into a single energy-based architecture, Neural ASMs contribute to safer robotics and provide a computational lens to study biological sensorimotor learning.
RO · Dec 22, 2025
Vision-Language-Policy Model for Dynamic Robot Task Planning
Jin Wang, Kim Tien Ly, Jacques Cloete et al.
Bridging the gap between natural language commands and autonomous execution in unstructured environments remains an open challenge for robotics. This requires robots to perceive and reason over the current task scene through multiple modalities, and to plan their behaviors to achieve their intended goals. Traditional robotic task-planning approaches often struggle to bridge low-level execution with high-level task reasoning, and cannot dynamically update task strategies when instructions change during execution, which ultimately limits their versatility and adaptability to new tasks. In this work, we propose a novel language model-based framework for dynamic robot task planning. Our Vision-Language-Policy (VLP) model, based on a vision-language model fine-tuned on real-world data, can interpret semantic instructions and integrate reasoning over the current task scene to generate behavior policies that control the robot to accomplish the task. Moreover, it can dynamically adjust the task strategy in response to changes in the task, enabling flexible adaptation to evolving task requirements. Experiments conducted with different robots and a variety of real-world tasks show that the trained model can efficiently adapt to novel scenarios and dynamically update its policy, demonstrating strong planning autonomy and cross-embodiment generalization. Videos: https://robovlp.github.io/
RO · Nov 13, 2024
Offline Adaptation of Quadruped Locomotion using Diffusion Models
Reece O'Mahoney, Alexander L. Mitchell, Wanming Yu et al.
We present a diffusion-based approach to quadrupedal locomotion that simultaneously addresses the limitations of learning and interpolating between multiple skills (modes) and of adapting offline to new locomotion behaviours after training. This is the first framework to apply classifier-free guided diffusion to quadruped locomotion and demonstrate its efficacy by extracting goal-conditioned behaviour from an originally unlabelled dataset. We show that these capabilities are compatible with a multi-skill policy and can be applied with little modification and minimal compute overhead, i.e., running entirely on the robot's onboard CPU. We verify the validity of our approach with hardware experiments on the ANYmal quadruped platform.
LG · Feb 23, 2025
MetaSym: A Symplectic Meta-learning Framework for Physical Intelligence
Pranav Vaidhyanathan, Aristotelis Papatheodorou, Mark T. Mitchison et al.
Scalable and generalizable physics-aware deep learning has long been considered a significant challenge with various applications across diverse domains ranging from robotics to molecular dynamics. Central to almost all physical systems are symplectic forms, the geometric backbone that underpins fundamental invariants like energy and momentum. In this work, we introduce a novel deep learning framework, MetaSym. In particular, MetaSym combines a strong symplectic inductive bias obtained from a symplectic encoder, and an autoregressive decoder with meta-attention. This principled design ensures that core physical invariants remain intact, while allowing flexible, data-efficient adaptation to system heterogeneities. We benchmark MetaSym with highly varied and realistic datasets, such as a high-dimensional spring-mesh system (Otness et al., 2021), an open quantum system with dissipation and measurement backaction, and robotics-inspired quadrotor dynamics. Our results demonstrate superior performance in modeling dynamics under few-shot adaptation, outperforming state-of-the-art baselines that use larger models.
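To make concrete why symplectic structure matters for invariants such as energy, the following sketch compares explicit Euler with symplectic Euler on a unit harmonic oscillator. It is a textbook illustration, not MetaSym itself; the step size, horizon and function names are arbitrary choices made for the demo.

```python
def energy(q, p):
    """Hamiltonian of a unit-mass, unit-stiffness harmonic oscillator."""
    return 0.5 * (q * q + p * p)


def explicit_euler(q, p, dt):
    """Non-symplectic update: both variables use the old state."""
    return q + dt * p, p - dt * q


def symplectic_euler(q, p, dt):
    """Symplectic update: momentum first, then position with the new momentum."""
    p_new = p - dt * q
    q_new = q + dt * p_new
    return q_new, p_new


def simulate(step, steps=1000, dt=0.1):
    """Integrate from (q, p) = (1, 0) and return the final energy."""
    q, p = 1.0, 0.0
    for _ in range(steps):
        q, p = step(q, p, dt)
    return energy(q, p)


if __name__ == "__main__":
    # The true energy is 0.5 at all times.
    print("explicit Euler  :", simulate(explicit_euler))    # grows without bound
    print("symplectic Euler:", simulate(symplectic_euler))  # stays close to 0.5
```

Explicit Euler multiplies the energy by (1 + dt^2) at every step, whereas the symplectic update keeps it within a small band around the true value. Structure-preserving learned models aim to build in this kind of bounded behaviour.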
RO · Jun 23, 2025
Learning Physical Systems: Symplectification via Gauge Fixing in Dirac Structures
Aristotelis Papatheodorou, Pranav Vaidhyanathan, Natalia Ares et al.
Physics-informed deep learning has achieved remarkable progress by embedding geometric priors, such as Hamiltonian symmetries and variational principles, into neural networks, enabling structure-preserving models that extrapolate with high accuracy. However, in systems with dissipation and holonomic constraints, ubiquitous in legged locomotion and multibody robotics, the canonical symplectic form becomes degenerate, undermining the very invariants that guarantee stability and long-term prediction. In this work, we tackle this foundational limitation by introducing Presymplectification Networks (PSNs), the first framework to learn the symplectification lift via Dirac structures, restoring a non-degenerate symplectic geometry by embedding constrained systems into a higher-dimensional manifold. Our architecture combines a recurrent encoder with a flow-matching objective to learn the augmented phase-space dynamics end-to-end. We then attach a lightweight Symplectic Network (SympNet) to forecast constrained trajectories while preserving energy, momentum, and constraint satisfaction. We demonstrate our method on the dynamics of the ANYmal quadruped robot, a challenging contact-rich, multibody system. To the best of our knowledge, this is the first framework that effectively bridges the gap between constrained, dissipative mechanical systems and symplectic learning, unlocking a whole new class of geometric machine learning models, grounded in first principles yet adaptable from data.
RO · May 12, 2025
Improving Trajectory Stitching with Flow Models
Reece O'Mahoney, Wanming Yu, Ioannis Havoutis
Generative models have shown great promise as trajectory planners, given their affinity to modeling complex distributions and guidable inference process. Previous works have successfully applied these in the context of robotic manipulation but perform poorly when the required solution does not exist as a complete trajectory within the training set. We identify that this is a result of being unable to plan via stitching, and subsequently address the architectural and dataset choices needed to remedy this. On top of this, we propose a novel addition to the training and inference procedures to both stabilize and enhance these capabilities. We demonstrate the efficacy of our approach by generating plans with out of distribution boundary conditions and performing obstacle avoidance on the Franka Panda in simulation and on real hardware. In both of these tasks our method performs significantly better than the baselines and is able to avoid obstacles up to four times as large.
RO · Jun 20, 2024
Adaptive Manipulation using Behavior Trees
Jacques Cloete, Wolfgang Merkt, Ioannis Havoutis
Many manipulation tasks pose a challenge since they depend on non-visual environmental information that can only be determined after sustained physical interaction has already begun. This is particularly relevant for effort-sensitive, dynamics-dependent tasks such as tightening a valve. To perform these tasks safely and reliably, robots must be able to quickly adapt in response to unexpected changes during task execution, and should also learn from past experience to better inform future decisions. Humans can intuitively respond and adapt their manipulation strategy to suit such problems, but representing and implementing such behaviors for robots remains a challenge. In this work we show how this can be achieved within the framework of behavior trees. We present the adaptive behavior tree, a scalable and generalizable behavior tree design that enables a robot to quickly adapt to and learn from both visual and non-visual observations during task execution, preempting task failure or switching to a different manipulation strategy. The adaptive behavior tree selects the manipulation strategy that is predicted to optimize task performance, and learns from past experience to improve these predictions for future attempts. We test our approach on a variety of tasks commonly found in industry; the adaptive behavior tree demonstrates safety, robustness (100% success rate) and efficiency in task completion (up to 36% task speedup from the baseline).
RO · May 22, 2023
You Only Look at One: Category-Level Object Representations for Pose Estimation From a Single Example
Walter Goodwin, Ioannis Havoutis, Ingmar Posner
In order to meaningfully interact with the world, robot manipulators must be able to interpret objects they encounter. A critical aspect of this interpretation is pose estimation: inferring quantities that describe the position and orientation of an object in 3D space. Most existing approaches to pose estimation make limiting assumptions, often working only for specific, known object instances, or at best generalising to an object category using large pose-labelled datasets. In this work, we present a method for achieving category-level pose estimation by inspection of just a single object from a desired category. We show that we can subsequently perform accurate pose estimation for unseen objects from an inspected category, and considerably outperform prior work by exploiting multi-view correspondences. We demonstrate that our method runs in real-time, enabling a robot manipulator equipped with an RGBD sensor to perform online 6D pose estimation for novel objects. Finally, we showcase our method in a continual learning setting, with a robot able to determine whether objects belong to known categories, and if not, use active perception to produce a one-shot category representation for subsequent pose estimation.
RO · Jan 19, 2022
BiConMP: A Nonlinear Model Predictive Control Framework for Whole Body Motion Planning
Avadesh Meduri, Paarth Shah, Julian Viereck et al.
Online planning of whole-body motions for legged robots is challenging due to the inherent nonlinearity in the robot dynamics. In this work, we propose a nonlinear MPC framework, BiConMP, which can generate whole body trajectories online by efficiently exploiting the structure of the robot dynamics. BiConMP is used to generate various cyclic gaits on a real quadruped robot and its performance is evaluated on different terrain, countering unforeseen pushes and transitioning online between different gaits. Further, the ability of BiConMP to generate non-trivial acyclic whole-body dynamic motions on the robot is presented. The same approach is also used to generate various dynamic motions in MPC on a humanoid robot (Talos) and another quadruped robot (ANYmal) in simulation. Finally, an extensive empirical analysis on the effects of planning horizon and frequency on the nonlinear MPC framework is reported and discussed.
RO · Jan 13, 2022
Motion Planning in Dynamic Environments Using Context-Aware Human Trajectory Prediction
Mark Nicholas Finean, Luka Petrović, Wolfgang Merkt et al.
Over the years, the separate fields of motion planning, mapping, and human trajectory prediction have advanced considerably. However, the literature is still sparse in providing practical frameworks that enable mobile manipulators to perform whole-body movements and account for the predicted motion of moving obstacles. Previous optimisation-based motion planning approaches that use distance fields have suffered from the high computational cost required to update the environment representation. We demonstrate that GPU-accelerated predicted composite distance fields significantly reduce the computation time compared to calculating distance fields from scratch. We integrate this technique with a complete motion planning and perception framework that accounts for the predicted motion of humans in dynamic environments, enabling reactive and pre-emptive motion planning that incorporates predicted motions. To achieve this, we propose and implement a novel human trajectory prediction method that combines intention recognition with trajectory optimisation-based motion planning. We validate our resultant framework on a real-world Toyota Human Support Robot (HSR) using live RGB-D sensor data from the onboard camera. In addition to providing analysis on a publicly available dataset, we release the Oxford Indoor Human Motion (Oxford-IHM) dataset and demonstrate state-of-the-art performance in human trajectory prediction. The Oxford-IHM dataset is a human trajectory prediction dataset in which people walk between regions of interest in an indoor environment. Both static and robot-mounted RGB-D cameras observe the people while tracked with a motion-capture system.
RO · Dec 9, 2021
Next Steps: Learning a Disentangled Gait Representation for Versatile Quadruped Locomotion
Alexander L. Mitchell, Wolfgang Merkt, Mathieu Geisert et al.
Quadruped locomotion is rapidly maturing to a degree where robots now routinely traverse a variety of unstructured terrains. However, while gaits can be varied typically by selecting from a range of pre-computed styles, current planners are unable to vary key gait parameters continuously while the robot is in motion. The synthesis, on-the-fly, of gaits with unexpected operational characteristics or even the blending of dynamic manoeuvres lies beyond the capabilities of the current state-of-the-art. In this work we address this limitation by learning a latent space capturing the key stance phases of a particular gait, via a generative model trained on a single trot style. This encourages disentanglement such that application of a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. In fact properties of this drive signal map directly to gait parameters such as cadence, footstep height and full stance duration. The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on a real ANYmal quadruped robot and demonstrate that our method achieves a continuous blend of dynamic trot styles whilst being robust and reactive to external perturbations.
RO · Nov 15, 2021
Semantically Grounded Object Matching for Robust Robotic Scene Rearrangement
Walter Goodwin, Sagar Vaze, Ioannis Havoutis et al.
Object rearrangement has recently emerged as a key competency in robot manipulation, with practical solutions generally involving object detection, recognition, grasping and high-level planning. Goal-images describing a desired scene configuration are a promising and increasingly used mode of instruction. A key outstanding challenge is the accurate inference of matches between objects in front of a robot, and those seen in a provided goal image, where recent works have struggled in the absence of object-specific training data. In this work, we explore the deterioration of existing methods' ability to infer matches between objects as the visual shift between observed and goal scenes increases. We find that a fundamental limitation of the current setting is that source and target images must contain the same instance of every object, which restricts practical deployment. We present a novel approach to object matching that uses a large pre-trained vision-language model to match objects in a cross-instance setting by leveraging semantics together with visual features as a more robust, and much more general, measure of similarity. We demonstrate that this provides considerably improved matching performance in cross-instance settings, and can be used to guide multi-object rearrangement with a robot manipulator from an image that shares no object instances with the robot's scene.
RO · Sep 10, 2021
Where Should I Look? Optimised Gaze Control for Whole-Body Collision Avoidance in Dynamic Environments
Mark Nicholas Finean, Wolfgang Merkt, Ioannis Havoutis
As robots operate in increasingly complex and dynamic environments, fast motion re-planning has become a widely explored area of research. In a real-world deployment, we often lack the ability to fully observe the environment at all times, giving rise to the challenge of determining how to best perceive the environment given a continuously updated motion plan. We provide the first investigation into a 'smart' controller for gaze control with the objective of providing effective perception of the environment for obstacle avoidance and motion planning in dynamic and unknown environments. We detail the novel problem of determining the best head camera behaviour for mobile robots when constrained by a trajectory. Furthermore, we propose a greedy optimisation-based solution that uses a combination of voxelised rewards and motion primitives. We demonstrate that our method outperforms the benchmark methods in 2D and 3D environments, in respect of both the ability to explore the local surroundings, as well as in a superior success rate of finding collision-free trajectories -- our method is shown to provide 7.4x better map exploration while consistently achieving a higher success rate for generating collision-free trajectories. We verify our findings on a physical Toyota Human Support Robot (HSR) using a GPU-accelerated perception framework.
RO · Aug 4, 2021
Rapid Convex Optimization of Centroidal Dynamics using Block Coordinate Descent
Paarth Shah, Avadesh Meduri, Wolfgang Merkt et al.
In this paper we explore the use of block coordinate descent (BCD) to optimize the centroidal momentum dynamics for dynamically consistent multi-contact behaviors. The centroidal dynamics have recently received a large amount of attention in order to create physically realizable motions for robots with hands and feet while being computationally more tractable than full rigid body dynamics models. Our contribution lies in exploiting the structure of the dynamics in order to simplify the original non-convex problem into two convex subproblems. We iterate between these two subproblems for a set number of iterations or until a consensus is reached. We explore the properties of the proposed optimization method for the centroidal dynamics and verify in simulation that motions generated by our approach can be tracked by the quadruped Solo12. In addition, we compare our method to a recently proposed convexification using a sequence of convex relaxations as well as a more standard interior point method used in the off-the-shelf solver IPOPT to show that our approach finds similar, if not better, trajectories (in terms of cost), and is more than four times faster than both approaches. Finally, compared to previous approaches, we note its practicality due to the convex nature of each subproblem which allows our method to be used with any off-the-shelf quadratic programming solver.
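The pattern of splitting a non-convex problem into two convex subproblems and alternating between them can be illustrated on a toy biconvex cost. This is not the centroidal dynamics problem: the cost `(x*y - c)^2 + lam*(x^2 + y^2)`, the constants and the function names are all assumptions chosen so that each subproblem is a one-dimensional convex quadratic with a closed-form minimiser.

```python
def cost(x, y, c=2.0, lam=0.5):
    """Toy cost: non-convex jointly, convex in x for fixed y and vice versa."""
    return (x * y - c) ** 2 + lam * (x * x + y * y)


def block_coordinate_descent(x=1.0, y=1.0, c=2.0, lam=0.5,
                             max_iters=100, tol=1e-12):
    """Alternate exact minimisation over x and over y until the iterates settle."""
    history = [cost(x, y, c, lam)]
    for _ in range(max_iters):
        # Subproblem 1: minimise over x with y fixed (convex quadratic in x).
        x_new = c * y / (y * y + lam)
        # Subproblem 2: minimise over y with x fixed (convex quadratic in y).
        y_new = c * x_new / (x_new * x_new + lam)
        converged = abs(x_new - x) + abs(y_new - y) < tol
        x, y = x_new, y_new
        history.append(cost(x, y, c, lam))
        if converged:
            break
    return x, y, history


if __name__ == "__main__":
    x, y, history = block_coordinate_descent()
    print(x, y, history[0], history[-1])
```

Because each block update is an exact minimiser of its subproblem, the cost never increases from one iteration to the next, and the loop stops either at the iteration cap or once the two blocks stop changing.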
ROJun 20, 2021
HapFIC: An Adaptive Force/Position Controller for Safe Environment Interaction in Articulated SystemsCarlo Tiseo, Wolfgang Merkt, Keyhan Kouhkiloui Babarahmati et al.
Haptic interaction is essential for the dynamic dexterity of animals, which seamlessly switch from an impedance to an admittance behaviour using the force feedback from their proprioception. However, this ability is extremely challenging to reproduce in robots, especially when dealing with complex interaction dynamics, distributed contacts, and contact switching. Current model-based controllers require accurate interaction modelling to account for contacts and stabilise the interaction. In this manuscript, we propose an adaptive force/position controller that exploits the fractal impedance controller's passivity and non-linearity to execute a finite search algorithm using the force feedback signal from the sensor at the end-effector. The method is computationally inexpensive, opening the possibility to deal with distributed contacts in the future. We evaluated the architecture in physics simulation and showed that the controller can robustly control the interaction with objects of different dynamics without violating the maximum allowable target forces or causing numerical instability even for very rigid objects. The proposed controller can also autonomously deal with contact switching and may find application in multiple fields such as legged locomotion, rehabilitation and assistive robotics.
ROApr 19, 2021
Receding-Horizon Perceptive Trajectory Optimization for Dynamic Legged Locomotion with Learned InitializationOliwier Melon, Romeo Orsolino, David Surovik et al.
To dynamically traverse challenging terrain, legged robots need to continually perceive and reason about upcoming features, adjust the locations and timings of future footfalls and leverage momentum strategically. We present a pipeline that enables flexibly-parametrized trajectories for perceptive and dynamic quadruped locomotion to be optimized in an online, receding-horizon manner. The initial guess passed to the optimizer affects the computation needed to achieve convergence and the quality of the solution. We consider two methods for generating good guesses. The first is a heuristic initializer which provides a simple guess and requires significant optimization but is nonetheless suitable for adaptation to upcoming terrain. We demonstrate experiments using the ANYmal C quadruped, with fully onboard sensing and computation, to cross obstacles at moderate speeds using this technique. Our second approach uses latent-mode trajectory regression (LMTR) to imitate expert data - while avoiding invalid interpolations between distinct behaviors - such that minimal optimization is needed. This enables high-speed motions that make more expansive use of the robot's capabilities. We demonstrate it on flat ground with the real robot and provide numerical trials that progress toward deployment on terrain. These results illustrate a paradigm for advancing beyond short-horizon dynamic reactions, toward the type of intuitive and adaptive locomotion planning exhibited by animals and humans.
ROMar 22, 2021
Introspective Visuomotor Control: Exploiting Uncertainty in Deep Visuomotor Control for Failure RecoveryChia-Man Hung, Li Sun, Yizhe Wu et al.
End-to-end visuomotor control is emerging as a compelling solution for robot manipulation tasks. However, imitation learning-based visuomotor control approaches tend to suffer from a common limitation, lacking the ability to recover from an out-of-distribution state caused by compounding errors. In this paper, instead of using tactile feedback or explicitly detecting the failure through vision, we investigate using the uncertainty of a policy neural network. We propose a novel uncertainty-based approach to detect and recover from failure cases. Our hypothesis is that policy uncertainties can implicitly indicate the potential failures in the visuomotor control task and that robot states with minimum uncertainty are more likely to lead to task success. To recover from high uncertainty cases, the robot monitors its uncertainty along a trajectory and explores possible actions in the state-action space to bring itself to a more certain state. Our experiments verify this hypothesis and show a significant improvement on task success rate: 12% in pushing, 15% in pick-and-reach and 22% in pick-and-place.
ROMar 5, 2021
Simultaneous Scene Reconstruction and Whole-Body Motion Planning for Safe Operation in Dynamic EnvironmentsMark Nicholas Finean, Wolfgang Merkt, Ioannis Havoutis
Recent work has demonstrated real-time mapping and reconstruction from dense perception, while motion planning based on distance fields has been shown to achieve fast, collision-free motion synthesis with good convergence properties. However, demonstration of a fully integrated system that can safely re-plan in unknown environments, in the presence of static and dynamic obstacles, has remained an open challenge. In this work, we first study the impact that signed and unsigned distance fields have on optimisation convergence, and the resultant error cost in trajectory optimisation problems in 2D path planning, arm manipulator motion planning, and whole-body loco-manipulation planning. We further analyse the performance of three state-of-the-art approaches to generating distance fields (Voxblox, Fiesta, and GPU-Voxels) for use in real-time environment reconstruction. Finally, we use our findings to construct a practical hybrid mapping and motion planning system which uses GPU-Voxels and GPMP2 to perform receding-horizon whole-body motion planning that can smoothly avoid moving obstacles in 3D space using live sensor data. Our results are validated in simulation and on a real-world Toyota Human Support Robot (HSR).
ROFeb 25, 2021
CPG-ACTOR: Reinforcement Learning for Central Pattern GeneratorsLuigi Campanaro, Siddhant Gangapurwala, Daniele De Martini et al.
Central Pattern Generators (CPGs) have several properties desirable for locomotion: they generate smooth trajectories, are robust to perturbations and are simple to implement. Although conceptually promising, we argue that the full potential of CPGs has so far been limited by insufficient sensory-feedback information. This paper proposes a new methodology that allows tuning CPG controllers through gradient-based optimization in a Reinforcement Learning (RL) setting. To the best of our knowledge, this is the first time CPGs have been trained in conjunction with a Multilayer Perceptron (MLP) network in a Deep-RL context. In particular, we show how CPGs can directly be integrated as the Actor in an Actor-Critic formulation. Additionally, we demonstrate how this change permits us to integrate highly non-linear feedback directly from sensory perception to reshape the oscillators' dynamics. Our results on a locomotion task using a single-leg hopper demonstrate that explicitly using the CPG as the Actor rather than as part of the environment results in a significant increase in the reward gained over time (6x more) compared with previous approaches. Furthermore, we show that our method without feedback reproduces results similar to prior work with feedback. Finally, we demonstrate how our closed-loop CPG progressively improves the hopping behaviour for longer training epochs relying only on basic reward functions.
RODec 5, 2020
RLOC: Terrain-Aware Legged Locomotion using Reinforcement Learning and Optimal ControlSiddhant Gangapurwala, Mathieu Geisert, Romeo Orsolino et al.
We present a unified model-based and data-driven approach for quadrupedal planning and control to achieve dynamic locomotion over uneven terrain. We utilize on-board proprioceptive and exteroceptive feedback to map sensory information and desired base velocity commands into footstep plans using a reinforcement learning (RL) policy. This RL policy is trained in simulation over a wide range of procedurally generated terrains. When run online, the system tracks the generated footstep plans using a model-based motion controller. We evaluate the robustness of our method over a wide variety of complex terrains. It exhibits behaviors which prioritize stability over aggressive locomotion. Additionally, we introduce two ancillary RL policies for corrective whole-body motion tracking and recovery control. These policies account for changes in physical parameters and external perturbations. We train and evaluate our framework on a complex quadrupedal system, ANYmal version B, and demonstrate transferability to a larger and heavier robot, ANYmal C, without requiring retraining.
RONov 14, 2020
Sparsity-Inducing Optimal Control via Differential Dynamic ProgrammingTraiko Dinev, Wolfgang Merkt, Vladimir Ivan et al.
Optimal control is a popular approach to synthesize highly dynamic motion. Commonly, $L_2$ regularization is used on the control inputs in order to minimize energy used and to ensure smoothness of the control inputs. However, for some systems, such as satellites, the control needs to be applied in sparse bursts due to how the propulsion system operates. In this paper, we study approaches to induce sparsity in optimal control solutions -- namely via smooth $L_1$ and Huber regularization penalties. We apply these loss terms to state-of-the-art DDP-based solvers to create a family of sparsity-inducing optimal control methods. We analyze and compare the effect of the different losses on inducing sparsity, their numerical conditioning, their impact on convergence, and discuss hyperparameter settings. We demonstrate our method in simulation and hardware experiments on canonical dynamics systems, control of satellites, and the NASA Valkyrie humanoid robot. We provide an implementation of our method and all examples for reproducibility on GitHub.
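To make the difference between these regularizers concrete, the sketch below evaluates an $L_2$ penalty, a Huber penalty, and one possible smooth approximation of the $L_1$ norm on a scalar control input. The abstract does not give the exact functional forms, so the definitions here (in particular the `sqrt(u^2 + eps^2) - eps` smoothing and the parameters `delta` and `eps`) are assumptions using standard textbook forms, not necessarily the ones used by the authors.

```python
import math


def l2_penalty(u):
    """Quadratic penalty: its gradient vanishes as u -> 0, so small
    non-zero controls are barely discouraged."""
    return 0.5 * u * u


def huber_penalty(u, delta=1.0):
    """Standard Huber loss: quadratic for |u| <= delta, linear beyond."""
    a = abs(u)
    if a <= delta:
        return 0.5 * u * u
    return delta * (a - 0.5 * delta)


def smooth_l1_penalty(u, eps=1e-3):
    """Assumed smooth surrogate for |u|: differentiable at 0 and within
    eps of |u| everywhere."""
    return math.sqrt(u * u + eps * eps) - eps


if __name__ == "__main__":
    for u in (0.0, 0.01, 0.5, 2.0):
        print(u, l2_penalty(u), huber_penalty(u), smooth_l1_penalty(u))
```

For small `u` the smooth $L_1$ surrogate grows roughly linearly while the quadratic penalty is nearly flat, which is the usual intuition for why $L_1$-like terms push control inputs to exactly or nearly zero.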
RONov 1, 2020
A Passive Navigation Planning Algorithm for Collision-free Control of Mobile RobotsCarlo Tiseo, Vladimir Ivan, Wolfgang Merkt et al.
Path planning and collision avoidance are challenging in complex and highly variable environments due to the limited horizon of events. In the literature, there are multiple model- and learning-based approaches that require significant computational resources to be effectively deployed and they may have limited generality. We propose a planning algorithm based on a globally stable passive controller that can plan smooth trajectories using limited computational resources in challenging environmental conditions. The architecture combines the recently proposed fractal impedance controller with elastic bands and regions of finite time invariance. As the method is based on an impedance controller, it can also be used directly as a force/torque controller. We validated our method in simulation to analyse the ability of interactive navigation in challenging concave domains via the issuing of via-points, and its robustness to low bandwidth feedback. A swarm simulation using 11 agents validated the scalability of the proposed method. We have performed hardware experiments on a holonomic wheeled platform validating smoothness and robustness of interaction with dynamic agents (i.e., humans and robots). The computational complexity of the proposed local planner enables deployment with low-power micro-controllers lowering the energy consumption compared to other methods that rely upon numeric optimisation.
ROOct 11, 2020
Inverse Dynamics vs. Forward Dynamics in Direct Transcription Formulations for Trajectory OptimizationHenrique Ferrolho, Vladimir Ivan, Wolfgang Merkt et al.
Benchmarks of state-of-the-art rigid-body dynamics libraries report better performance solving the inverse dynamics problem than the forward alternative. Those benchmarks encouraged us to question whether that computational advantage would translate to direct transcription, where calculating rigid-body dynamics and their derivatives accounts for a significant share of computation time. In this work, we implement an optimization framework where both approaches for enforcing the system dynamics are available. We evaluate the performance of each approach for systems of varying complexity, for domains with rigid contacts. Our tests reveal that formulations using inverse dynamics converge faster, require fewer iterations, and are more robust to coarse problem discretization. These results indicate that inverse dynamics should be preferred to enforce the nonlinear system dynamics in simultaneous methods, such as direct transcription.
ROOct 2, 2020
Memory Clustering using Persistent Homology for Multimodality- and Discontinuity-Sensitive Learning of Optimal Control Warm-startsWolfgang Merkt, Vladimir Ivan, Traiko Dinev et al.
Shooting methods are an efficient approach to solving nonlinear optimal control problems. As they use local optimization, they exhibit favorable convergence when initialized with a good warm-start but may not converge at all if provided with a poor initial guess. Recent work has focused on providing an initial guess from a learned model trained on samples generated during an offline exploration of the problem space. However, in practice the solutions contain discontinuities introduced by system dynamics or the environment. Additionally, in many cases multiple equally suitable, i.e., multi-modal, solutions exist to solve a problem. Classic learning approaches smooth across the boundary of these discontinuities and thus generalize poorly. In this work, we apply tools from algebraic topology to extract information on the underlying structure of the solution space. In particular, we introduce a method based on persistent homology to automatically cluster the dataset of precomputed solutions to obtain different candidate initial guesses. We then train a Mixture-of-Experts within each cluster to predict state and control trajectories to warm-start the optimal control solver and provide a comparison with modality-agnostic learning. We demonstrate our method on a cart-pole toy problem and a quadrotor avoiding obstacles, and show that clustering samples based on inherent structure improves the warm-start quality.
ROAug 3, 2020
Predicted Composite Signed-Distance Fields for Real-Time Motion Planning in Dynamic EnvironmentsMark Nicholas Finean, Wolfgang Merkt, Ioannis Havoutis
We present a novel framework for motion planning in dynamic environments that accounts for the predicted trajectories of moving objects in the scene. We explore the use of composite signed-distance fields in motion planning and detail how they can be used to generate signed-distance fields (SDFs) in real-time to incorporate predicted obstacle motions. We benchmark our approach of using composite SDFs against performing exact SDF calculations on the workspace occupancy grid. Our proposed technique generates predictions substantially faster and typically exhibits an 81--97% reduction in time for subsequent predictions. We integrate our framework with GPMP2 to demonstrate a full implementation of our approach in real-time, enabling a 7-DoF Panda arm to smoothly avoid a moving robot.
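One common way to build a composite signed-distance field is to take the pointwise minimum of a static-scene SDF and the SDFs of individual moving objects placed at their predicted positions. The sketch below illustrates that idea in 2D with circular obstacles and a constant-velocity prediction. The abstract does not specify how the composition or prediction is done, so the minimum operator, the motion model, and every name here (`circle_sdf`, `composite_sdf`, the obstacle tuple layout) are assumptions for illustration only.

```python
import math


def circle_sdf(px, py, cx, cy, radius):
    """Signed distance from point (px, py) to a circle: negative inside."""
    return math.hypot(px - cx, py - cy) - radius


def composite_sdf(px, py, static_sdf, obstacles, t):
    """Pointwise minimum of the static SDF and each moving obstacle's SDF,
    with obstacles advanced to time t under an assumed constant velocity.

    obstacles: iterable of (cx, cy, vx, vy, radius) tuples (hypothetical layout).
    """
    d = static_sdf(px, py)
    for cx, cy, vx, vy, radius in obstacles:
        predicted_x = cx + vx * t
        predicted_y = cy + vy * t
        d = min(d, circle_sdf(px, py, predicted_x, predicted_y, radius))
    return d


if __name__ == "__main__":
    # Static scene: a wall along x = 0, free space at x > 0.
    wall = lambda px, py: px
    # One obstacle of radius 1 starting at (5, 0), moving in -x at 1 unit/s.
    moving = [(5.0, 0.0, -1.0, 0.0, 1.0)]
    for t in (0.0, 1.0, 2.0):
        print(t, composite_sdf(3.0, 0.0, wall, moving, t))
```

The appeal of this construction is that the static field is computed once and only the cheap per-object terms change with the prediction time. The minimum is exact for query points in free space; inside overlapping obstacles it is only an approximation of the true signed distance.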
ROJul 3, 2020
First Steps: Latent-Space Control with Semantic Constraints for Quadruped LocomotionAlexander L. Mitchell, Martin Engelcke, Oiwi Parker Jones et al.
Traditional approaches to quadruped control frequently employ simplified, hand-derived models. This significantly reduces the capability of the robot since its effective kinematic range is curtailed. In addition, kinodynamic constraints are often non-differentiable and difficult to implement in an optimisation approach. In this work, these challenges are addressed by framing quadruped control as optimisation in a structured latent space. A deep generative model captures a statistical representation of feasible joint configurations, whilst complex dynamic and terminal constraints are expressed via high-level, semantic indicators and represented by learned classifiers operating upon the latent space. As a consequence, complex constraints are rendered differentiable and evaluated an order of magnitude faster than analytical approaches. We validate the feasibility of locomotion trajectories optimised using our approach both in simulation and on a real-world ANYmal quadruped. Our results demonstrate that this approach is capable of generating smooth and realisable trajectories. To the best of our knowledge, this is the first time latent space control has been successfully applied to a complex, real robot platform.
ROMar 11, 2020
Motion Planning for Quadrupedal Locomotion: Coupled Planning, Terrain Mapping and Whole-Body ControlCarlos Mastalli, Ioannis Havoutis, Michele Focchi et al.
Planning whole-body motions while taking into account the terrain conditions is a challenging problem for legged robots since the terrain model might produce many local minima. Our coupled planning method uses stochastic and derivative-free search to plan both foothold locations and horizontal motions due to the local minima produced by the terrain model. It jointly optimizes body motion, step duration and foothold selection, and it models the terrain as a cost-map. Due to the novel attitude planning method, the horizontal motion plans can be applied to various terrain conditions. The attitude planner ensures the robot's stability by imposing limits on the angular acceleration. Our whole-body controller compliantly tracks trunk motions while avoiding slippage, as well as kinematic and torque limits. Despite the use of a simplified model, which is restricted to flat terrain, our approach shows remarkable capability to deal with a wide range of non-coplanar terrains. The results are validated by experimental trials and comparative evaluations in a series of terrains of progressively increasing complexity.
ROFeb 22, 2020
Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot LocomotionSiddhant Gangapurwala, Alexander Mitchell, Ioannis Havoutis
Deep reinforcement learning (RL) uses model-free techniques to optimize task-specific control policies. Despite having emerged as a promising approach for complex problems, RL is still hard to use reliably for real-world applications. Apart from challenges such as precise reward function tuning, inaccurate sensing and actuation, and non-deterministic response, existing RL methods do not guarantee behavior within required safety constraints that are crucial for real robot scenarios. In this regard, we introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO) for tracking base velocity commands while following the defined constraints. We also introduce schemes which encourage state recovery into constrained regions in case of constraint violations. We present experimental results of our training method and test it on the real ANYmal quadruped robot. We compare our approach against the unconstrained RL method and show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
ROFeb 17, 2020
Reliable Trajectories for Dynamic Quadrupeds using Analytical Costs and Learned InitializationsOliwier Melon, Mathieu Geisert, David Surovik et al.
Dynamic traversal of uneven terrain is a major objective in the field of legged robotics. The most recent model predictive control approaches for these systems can generate robust dynamic motion of short duration; however, planning over a longer time horizon may be necessary when navigating complex terrain. A recently-developed framework, Trajectory Optimization for Walking Robots (TOWR), computes such plans but does not guarantee their reliability on real platforms, under uncertainty and perturbations. We extend TOWR with analytical costs to generate trajectories that a state-of-the-art whole-body tracking controller can successfully execute. To reduce online computation time, we implement a learning-based scheme for initialization of the nonlinear program based on offline experience. The execution of trajectories as long as 16 footsteps and 5.5 s over different terrains by a real quadruped demonstrates the effectiveness of the approach on hardware. This work builds toward an online system which can efficiently and robustly replan dynamic trajectories.
ROApr 17, 2019
Contact Planning for the ANYmal Quadruped Robot using an Acyclic Reachability-Based PlannerMathieu Geisert, Thomas Yates, Asil Orgen et al.
Despite the great progress in quadrupedal robotics during the last decade, selecting good contacts (footholds) in highly uneven and cluttered environments still remains an open challenge. This paper builds upon a state-of-the-art approach, already successfully used for humanoid robots, and applies it to our robotic platform, the quadruped robot ANYmal. The proposed algorithm decouples the problem into two subproblems: first a guide trajectory for the robot is generated, then contacts are created along this trajectory. Both subproblems rely on approximations and heuristics that need to be tuned. The main contribution of this work is to explain how this algorithm has been retuned to work with ANYmal and to show the relevance of the approach with a variety of tests in realistic dynamic simulations.
ROApr 9, 2019
Hierarchical Planning of Dynamic Movements without Scheduled Contact SequencesCarlos Mastalli, Ioannis Havoutis, Michele Focchi et al.
Most animal and human locomotion behaviors for solving complex tasks involve dynamic motions and rich contact interaction. In fact, complex maneuvers need to consider dynamic movement and contact events at the same time. We present a hierarchical trajectory optimization approach for planning dynamic movements with unscheduled contact sequences. We compute whole-body motions that achieve goals that cannot be reached in a kinematic fashion. First, we find a feasible CoM motion according to the centroidal dynamics of the robot. Then, we refine the solution by applying the robot's full-dynamics model, where the feasible CoM trajectory is used as a warm-start point. To accomplish the unscheduled contact behavior, we use complementarity constraints to describe the contact model, i.e. environment geometry and non-sliding active contacts. Both optimization phases are posed as Mathematical Programs with Complementarity Constraints (MPCCs). Experimental trials demonstrate the performance of our planning approach in a set of challenging tasks.
ROApr 7, 2019
Planning and Execution of Dynamic Whole-Body Locomotion for a Hydraulic Quadruped on Challenging TerrainAlexander W. Winkler, Carlos Mastalli, Ioannis Havoutis et al.
We present a framework for dynamic quadrupedal locomotion over challenging terrain, where the choice of appropriate footholds is crucial for the success of the behaviour. We build a model of the environment on-line and on-board using an efficient occupancy grid representation. We use Anytime Repairing A* (ARA*) to search over a tree of possible actions, choose a rough body path and select the locally-best footholds accordingly. We run an n-step lookahead optimization of the body trajectory using a dynamic stability metric, the Zero Moment Point (ZMP), that generates natural dynamic whole-body motions. A combination of floating-base inverse dynamics and virtual model control accurately executes the desired motions on an actively compliant system. Experimental trials show that this framework allows us to traverse terrains at nearly 6 times the speed of our previous work, evaluated over the same set of trials.
ROApr 7, 2019
On-line and on-board planning and perception for quadrupedal locomotionCarlos Mastalli, Ioannis Havoutis, Alexander W. Winkler et al.
We present a legged motion planning approach for quadrupedal locomotion over challenging terrain. We decompose the problem into body action planning and footstep planning. We use a lattice representation together with a set of defined body movement primitives for computing a body action plan. The lattice representation allows us to plan versatile movements that ensure feasibility for every possible plan. To this end, we propose a set of rules that define the footstep search regions and footstep sequence given a body action. We use Anytime Repairing A* (ARA*) search, which guarantees bounded suboptimal plans. Our main contribution is a planning approach that generates versatile movements on-line. Experimental trials demonstrate the performance of our planning approach in a set of challenging terrain conditions. The terrain information and plans are computed on-line and on-board.