ROMay 30
A Unified Framework for Probabilistic Dynamic-, Trajectory- and Vision-based Virtual FixturesMaximilian Mühlbauer, Bernhard Weber, Sylvain Calinon et al.
Probabilistic Virtual Fixtures (VFs) enable the adaptive selection of the most suitable haptic feedback for each phase of a task, based on learned or perceived uncertainty. While keeping the human in the loop remains essential, for instance, to ensure high precision, partial automation of certain task phases is critical for productivity. We present a unified framework for probabilistic VFs that seamlessly switches between manual fixtures, semi-automated fixtures (with the human handling precise tasks), and full autonomy. We introduce a novel probabilistic Dynamical System-based VF for coarse guidance, enabling the robot to autonomously complete certain task phases while keeping the human operator in the loop. For tasks requiring precise guidance, we extend probabilistic position-based trajectory fixtures with automation, allowing for seamless human interaction, geometry-awareness and optimal impedance gains. For manual tasks requiring very precise guidance, we also extend visual servoing fixtures with the same geometry-awareness and impedance behavior. We validate our approach on different robots, including an evaluation with expert users, showcasing operation modes, the ease of programming fixtures and lower interaction forces and favorable usability compared to a baseline.
ROMay 12, 2022
Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid ObjectsJunjia Liu, Yiting Chen, Zhipeng Dong et al.
This letter describes an approach to achieve well-known Chinese cooking art stir-fry on a bimanual robot system. Stir-fry requires a sequence of highly dynamic coordinated movements, which is usually difficult to learn for a chef, let alone transfer to robots. In this letter, we define a canonical stir-fry movement, and then propose a decoupled framework for learning this deformable object manipulation from human demonstration. First, the dual arms of the robot are decoupled into different roles (a leader and follower) and learned with classical and neural network-based methods separately, then the bimanual task is transformed into a coordination problem. To obtain general bimanual coordination, we secondly propose a Graph and Transformer based model -- Structured-Transformer, to capture the spatio-temporal relationship between dual-arm movements. Finally, by adding visual feedback of content deformation, our framework can adjust the movements automatically to achieve the desired stir-fry effect. We verify the framework by a simulator and deploy it on a real bimanual Panda robot system. The experimental results validate our framework can realize the bimanual robot stir-fry motion and have the potential to extend to other deformable objects with bimanual coordination.
ROJun 10, 2022
Tensor Train for Global Optimization Problems in RoboticsSuhan Shetty, Teguh Lembono, Tobias Loew et al.
The convergence of many numerical optimization techniques is highly dependent on the initial guess given to the solver. To address this issue, we propose a novel approach that utilizes tensor methods to initialize existing optimization solvers near global optima. Our method does not require access to a database of good solutions. We first transform the cost function, which depends on both task parameters and optimization variables, into a probability density function. Unlike existing approaches, the joint probability distribution of the task parameters and optimization variables is approximated using the Tensor Train model, which enables efficient conditioning and sampling. We treat the task parameters as random variables, and for a given task, we generate samples for decision variables from the conditional distribution to initialize the optimization solver. Our method can produce multiple solutions (when they exist) faster than existing methods. We first evaluate the approach on benchmark functions for numerical optimization that are hard to solve using gradient-based optimization solvers with a naive initialization. The results show that the proposed method can generate samples close to global optima and from multiple modes. We then demonstrate the generality and relevance of our framework to robotics by applying it to inverse kinematics with obstacles and motion planning problems with a 7-DoF manipulator.
ROMay 19
Learn2Decompose: Learning Problem Decomposition for Efficient Sequential Multi-object Manipulation PlanningYan Zhang, Teng Xue, Amirreza Razmjoo et al.
We present an efficient task and motion replanning approach for sequential multi-object manipulation in dynamic environments. Conventional Task And Motion Planning (TAMP) solvers experience an exponential increase in planning time as the planning horizon and number of objects grow, limiting their applicability in real-world scenarios. To address this, we propose learning problem decompositions from demonstrations to accelerate TAMP solvers. Our approach consists of three key components: goal decomposition learning, computational distance learning, and object reduction. Goal decomposition identifies the necessary sequences of states that the system must pass through before reaching the final goal, treating them as subgoal sequences. Computational distance learning predicts the computational complexity between two states, enabling the system to identify the temporally closest subgoal from a disturbed state. Object reduction minimizes the set of active objects considered during replanning, further improving efficiency. We evaluate our approach on three benchmarks, demonstrating its effectiveness in improving replanning efficiency for sequential multi-object manipulation tasks in dynamic environments.
ROJun 22, 2023
SoftGPT: Learn Goal-oriented Soft Object Manipulation Skills by Generative Pre-trained Heterogeneous Graph TransformerJunjia Liu, Zhihao Li, Wanyu Lin et al.
Soft object manipulation tasks in domestic scenes pose a significant challenge for existing robotic skill learning techniques due to their complex dynamics and variable shape characteristics. Since learning new manipulation skills from human demonstration is an effective way for robot applications, developing prior knowledge of the representation and dynamics of soft objects is necessary. In this regard, we propose a pre-trained soft object manipulation skill learning model, namely SoftGPT, that is trained using large amounts of exploration data, consisting of a three-dimensional heterogeneous graph representation and a GPT-based dynamics model. For each downstream task, a goal-oriented policy agent is trained to predict the subsequent actions, and SoftGPT generates the consequences of these actions. Integrating these two approaches establishes a thinking process in the robot's mind that provides rollout for facilitating policy learning. Our results demonstrate that leveraging prior knowledge through this thinking process can efficiently learn various soft object manipulation skills, with the potential for direct learning from human demonstrations.
GRNov 7, 2025
Neural Image Abstraction Using Long Smoothing B-SplinesDaniel Berio, Michael Stroh, Sylvain Calinon et al.
We integrate smoothing B-splines into a standard differentiable vector graphics (DiffVG) pipeline through linear mapping, and show how this can be used to generate smooth and arbitrarily long paths within image-based deep learning systems. We take advantage of derivative-based smoothing costs for parametric control of fidelity vs. simplicity tradeoffs, while also enabling stylization control in geometric and image spaces. The proposed pipeline is compatible with recent vector graphics generation and vectorization methods. We demonstrate the versatility of our approach with four applications aimed at the generation of stylized vector graphics: stylized space-filling path generation, stroke-based image abstraction, closed-area image abstraction, and stylized text generation.
ROMay 13
Ergodic Imitation for Adaptive Exploration around DemonstrationsZiyi Xu, Cem Bilaloglu, Yiming Li et al.
In robotics, a common challenge in imitation learning is the mismatch between training and deployment conditions, caused, for example, by environmental changes or imperfect observation and control. When a robot follows a nominal trajectory under such mismatch, it may become stuck and fail to complete the task. This calls for adaptive online exploration strategies that remain grounded in demonstrations. To this end, we propose an adaptive ergodic imitation approach that constructs a target distribution from the geometry of the retrieved demonstrations and uses it to generate trajectories that adaptively interpolate between tracking and exploration. Our method extends ergodic control beyond its traditional role in area-coverage and search by incorporating demonstrations into a retrieval-based receding-horizon framework for adaptive imitation.
RODec 19, 2024
Human-Humanoid Robots Cross-Embodiment Behavior-Skill Transfer Using Decomposed Adversarial Learning from DemonstrationJunjia Liu, Zhuo Li, Minghao Yu et al.
Humanoid robots are envisioned as embodied intelligent agents capable of performing a wide range of human-level loco-manipulation tasks, particularly in scenarios requiring strenuous and repetitive labor. However, learning these skills is challenging due to the high degrees of freedom of humanoid robots, and collecting sufficient training data for humanoid is a laborious process. Given the rapid introduction of new humanoid platforms, a cross-embodiment framework that allows generalizable skill transfer is becoming increasingly critical. To address this, we propose a transferable framework that reduces the data bottleneck by using a unified digital human model as a common prototype and bypassing the need for re-training on every new robot platform. The model learns behavior primitives from human demonstrations through adversarial imitation, and the complex robot structures are decomposed into functional components, each trained independently and dynamically coordinated. Task generalization is achieved through a human-object interaction graph, and skills are transferred to different robots via embodiment-specific kinematic motion retargeting and dynamic fine-tuning. Our framework is validated on five humanoid robots with diverse configurations, demonstrating stable loco-manipulation and highlighting its effectiveness in reducing data requirements and increasing the efficiency of skill transfer across platforms.
ROMar 19, 2025
CCDP: Composition of Conditional Diffusion Policies with Guided SamplingAmirreza Razmjoo, Sylvain Calinon, Michael Gienger et al.
Imitation Learning offers a promising approach to learn directly from data without requiring explicit models, simulations, or detailed task definitions. During inference, actions are sampled from the learned distribution and executed on the robot. However, sampled actions may fail for various reasons, and simply repeating the sampling step until a successful action is obtained can be inefficient. In this work, we propose an enhanced sampling strategy that refines the sampling distribution to avoid previously unsuccessful actions. We demonstrate that by solely utilizing data from successful demonstrations, our method can infer recovery actions without the need for additional exploratory behavior or a high-level controller. Furthermore, we leverage the concept of diffusion model decomposition to break down the primary problem, which may require long-horizon history to manage failures, into multiple smaller, more manageable sub-problems in learning, data collection, and inference, thereby enabling the system to adapt to variable failure counts. Our approach yields a low-level controller that dynamically adjusts its sampling space to improve efficiency when prior samples fall short. We validate our method across several tasks, including door opening with unknown directions, object manipulation, and button-searching scenarios, demonstrating that our approach outperforms traditional baselines.
RONov 8, 2024
A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world RoboticsPuze Liu, Jonas Günster, Niklas Funk et al.
Machine learning methods have a groundbreaking impact in many application domains, but their application on real robotic platforms is still limited. Despite the many challenges associated with combining machine learning technology with robotics, robot learning remains one of the most promising directions for enhancing the capabilities of robots. When deploying learning-based approaches on real robots, extra effort is required to address the challenges posed by various real-world factors. To investigate the key factors influencing real-world deployment and to encourage original solutions from different researchers, we organized the Robot Air Hockey Challenge at the NeurIPS 2023 conference. We selected the air hockey task as a benchmark, encompassing low-level robotics problems and high-level tactics. Different from other machine learning-centric benchmarks, participants need to tackle practical challenges in robotics, such as the sim-to-real gap, low-level control issues, safety problems, real-time requirements, and the limited availability of real-world data. Furthermore, we focus on a dynamic environment, removing the typical assumption of quasi-static motions of other real-world benchmarks. The competition's results show that solutions combining learning-based approaches with prior knowledge outperform those relying solely on data when real-world deployment is challenging. Our ablation study reveals which real-world factors may be overlooked when building a learning-based solution. The successful real-world air hockey deployment of best-performing agents sets the foundation for future competitions and follow-up research directions.
ROOct 15, 2024
Learning Goal-oriented Bimanual Dough Rolling Using Dynamic Heterogeneous Graph Based on Human DemonstrationJunjia Liu, Chenzui Li, Shixiong Wang et al.
Soft object manipulation poses significant challenges for robots, requiring effective techniques for state representation and manipulation policy learning. State representation involves capturing the dynamic changes in the environment, while manipulation policy learning focuses on establishing the relationship between robot actions and state transformations to achieve specific goals. To address these challenges, this research paper introduces a novel approach: a dynamic heterogeneous graph-based model for learning goal-oriented soft object manipulation policies. The proposed model utilizes graphs as a unified representation for both states and policy learning. By leveraging the dynamic graph, we can extract crucial information regarding object dynamics and manipulation policies. Furthermore, the model facilitates the integration of demonstrations, enabling guided policy learning. To evaluate the efficacy of our approach, we designed a dough rolling task and conducted experiments using both a differentiable simulator and a real-world humanoid robot. Additionally, several ablation studies were performed to analyze the effect of our method, demonstrating its superiority in achieving human-like behavior.
ROJul 5, 2021
Online and Offline Robot Programming via Augmented Reality WorkspacesYong Joon Thoo, Jérémy Maceiras, Philip Abbet et al.
Robot programming methods for industrial robots are time consuming and often require operators to have knowledge in robotics and programming. To reduce costs associated with reprogramming, various interfaces using augmented reality have recently been proposed to provide users with more intuitive means of controlling robots in real-time and programming them without having to code. However, most solutions require the operator to be close to the real robot's workspace which implies either removing it from the production line or shutting down the whole production line due to safety hazards. We propose a novel augmented reality interface providing the users with the ability to model a virtual representation of a workspace which can be saved and reused to program new tasks or adapt old ones without having to be co-located with the real robot. Similar to previous interfaces, the operators then have the ability to program robot tasks or control the robot in real-time by manipulating a virtual robot. We evaluate the intuitiveness and usability of the proposed interface with a user study where 18 participants programmed a robot manipulator for a disassembly task.
ROApr 21, 2021
Mixture Models for the Analysis, Edition, and Synthesis of Continuous Time SeriesSylvain Calinon
This chapter presents an overview of techniques used for the analysis, edition, and synthesis of time series, with a particular emphasis on motion data. The use of mixture models allows the decomposition of time signals as a superposition of basis functions. It provides a compact representation that aims at keeping the essential characteristics of the signals. Various types of basis functions have been proposed, with developments originating from different fields of research, including computer graphics, human motion science, robotics, control, and neuroscience. Examples of applications with radial, Bernstein and Fourier basis functions will be presented, with associated source codes to get familiar with these techniques.
ROJan 12, 2021
Ergodic Exploration using Tensor Train: Applications in Insertion TasksSuhan Shetty, João Silvério, Sylvain Calinon
In robotics, ergodic control extends the tracking principle by specifying a probability distribution over an area to cover instead of a trajectory to track. The original problem is formulated as a spectral multiscale coverage problem, typically requiring the spatial distribution to be decomposed as Fourier series. This approach does not scale well to control problems requiring exploration in search space of more than 2 dimensions. To address this issue, we propose the use of tensor trains, a recent low-rank tensor decomposition technique from the field of multilinear algebra. The proposed solution is efficient, both computationally and storage-wise, hence making it suitable for its online implementation in robotic systems. The approach is applied to a peg-in-hole insertion task requiring full 6D end-effector poses, implemented with a 7-axis Franka Emika Panda robot. In this experiment, ergodic exploration allows the task to be achieved without requiring the use of force/torque sensors.
RODec 11, 2020
Probabilistic Iterative LQR for Short Time Horizon MPCTeguh Santoso Lembono, Sylvain Calinon
Optimal control is often used in robotics for planning a trajectory to achieve some desired behavior, as expressed by the cost function. Most works in optimal control focus on finding a single optimal trajectory, which is then typically tracked by another controller. In this work, we instead consider trajectory distribution as the solution of an optimal control problem, resulting in better tracking performance and a more stable controller. A Gaussian distribution is first obtained from an iterative Linear Quadratic Regulator (iLQR) solver. A short horizon Model Predictive Control (MPC) is then used to track this distribution. We show that tracking the distribution is more cost-efficient and robust as compared to tracking the mean or using iLQR feedback control. The proposed method is validated with kinematic control of 7-DoF Panda manipulator and dynamic control of 6-DoF quadcopter in simulation.
RODec 11, 2020
Motion Mappings for Continuous Bilateral TeleoperationXiao Gao, João Silvério, Emmanuel Pignat et al.
Mapping operator motions to a robot is a key problem in teleoperation. Due to differences between workspaces, such as object locations, it is particularly challenging to derive smooth motion mappings that fulfill different goals (e.g. picking objects with different poses on the two sides or passing through key points). Indeed, most state-of-the-art methods rely on mode switches, leading to a discontinuous, low-transparency experience. In this paper, we propose a unified formulation for position, orientation and velocity mappings based on the poses of objects of interest in the operator and robot workspaces. We apply it in the context of bilateral teleoperation. Two possible implementations to achieve the proposed mappings are studied: an iterative approach based on locally-weighted translations and rotations, and a neural network approach. Evaluations are conducted both in simulation and using two torque-controlled Franka Emika Panda robots. Our results show that, despite longer training times, the neural network approach provides faster mapping evaluations and lower interaction forces for the operator, which are crucial for continuous, real-time teleoperation.
RONov 11, 2020
Learning Constrained Distributions of Robot Configurations with Generative Adversarial NetworkTeguh Santoso Lembono, Emmanuel Pignat, Julius Jankowski et al.
In high dimensional robotic system, the manifold of the valid configuration space often has a complex shape, especially under constraints such as end-effector orientation or static stability. We propose a generative adversarial network approach to learn the distribution of valid robot configurations under such constraints. It can generate configurations that are close to the constraint manifold. We present two applications of this method. First, by learning the conditional distribution with respect to the desired end-effector position, we can do fast inverse kinematics even for very high degrees of freedom (DoF) systems. Then, we use it to generate samples in sampling-based constrained motion planning algorithms to reduce the necessary projection steps, speeding up the computation. We validate the approach in simulation using the 7-DoF Panda manipulator and the 28-DoF humanoid robot Talos.
RONov 6, 2020
Generative adversarial training of product of policies for robust and adaptive movement primitivesEmmanuel Pignat, Hakan Girgin, Sylvain Calinon
In learning from demonstrations, many generative models of trajectories make simplifying assumptions of independence. Correctness is sacrificed in the name of tractability and speed of the learning phase. The ignored dependencies, which often are the kinematic and dynamic constraints of the system, are then only restored when synthesizing the motion, which introduces possibly heavy distortions. In this work, we propose to use those approximate trajectory distributions as close-to-optimal discriminators in the popular generative adversarial framework to stabilize and accelerate the learning procedure. The two problems of adaptability and robustness are addressed with our method. In order to adapt the motions to varying contexts, we propose to use a product of Gaussian policies defined in several parametrized task spaces. Robustness to perturbations and varying dynamics is ensured with the use of stochastic gradient descent and ensemble methods to learn the stochastic dynamics. Two experiments are performed on a 7-DoF manipulator to validate the approach.
RONov 3, 2020
A Laser-based Dual-arm System for Precise Control of Collaborative RobotsJoão Silvério, Guillaume Clivaz, Sylvain Calinon
Collaborative robots offer increased interaction capabilities at relatively low cost but in contrast to their industrial counterparts they inevitably lack precision. Moreover, in addition to the robots' own imperfect models, day-to-day operations entail various sources of errors that despite being small rapidly accumulate. This happens as tasks change and robots are re-programmed, often requiring time-consuming calibrations. These aspects strongly limit the application of collaborative robots in tasks demanding high precision (e.g. watch-making). We address this problem by relying on a dual-arm system with laser-based sensing to measure relative poses between objects of interest and compensate for pose errors coming from robot proprioception. Our approach leverages previous knowledge of object 3D models in combination with point cloud registration to efficiently extract relevant poses and compute corrective trajectories. This results in high-precision assembly behaviors. The approach is validated in a needle threading experiment, with a 150μm thread and a 300μm hole, and a USB insertion task using two 7-axis Panda robots.
ROOct 7, 2020
Learning from demonstration using products of experts: applications to manipulation and task prioritizationEmmanuel Pignat, João Silvério, Sylvain Calinon
Probability distributions are key components of many learning from demonstration (LfD) approaches. While the configuration of a manipulator is defined by its joint angles, poses are often best explained within several task spaces. In many approaches, distributions within relevant task spaces are learned independently and only combined at the control level. This simplification implies several problems that are addressed in this work. We show that the fusion of models in different task spaces can be expressed as a product of experts (PoE), where the probabilities of the models are multiplied and renormalized so that it becomes a proper distribution of joint angles. Multiple experiments are presented to show that learning the different models jointly in the PoE framework significantly improves the quality of the model. The proposed approach particularly stands out when the robot has to learn competitive or hierarchical objectives. Training the model jointly usually relies on contrastive divergence, which requires costly approximations that can affect performance. We propose an alternative strategy using variational inference and mixture model approximations. In particular, we show that the proposed approach can be extended to PoE with a nullspace structure (PoENS), where the model is able to recover tasks that are masked by the resolution of higher-level objectives.
ROAug 6, 2020
Active Improvement of Control Policies with Bayesian Gaussian Mixture ModelHakan Girgin, Emmanuel Pignat, Noémie Jaquier et al.
Learning from demonstration (LfD) is an intuitive framework allowing non-expert users to easily (re-)program robots. However, the quality and quantity of demonstrations have a great influence on the generalization performances of LfD approaches. In this paper, we introduce a novel active learning framework in order to improve the generalization capabilities of control policies. The proposed approach is based on the epistemic uncertainties of Bayesian Gaussian mixture models (BGMMs). We determine the new query point location by optimizing a closed-form information-density cost based on the quadratic Rényi entropy. Furthermore, to better represent uncertain regions and to avoid local optima problem, we propose to approximate the active learning cost with a Gaussian mixture model (GMM). We demonstrate our active learning framework in the context of a reaching task in a cluttered environment with an illustrative toy example and a real experiment with a Panda robot.
ROAug 4, 2020
Analysis and Transfer of Human Movement Manipulability in Industry-like ActivitiesNoémie Jaquier, Leonel Rozo, Sylvain Calinon
Humans exhibit outstanding learning, planning and adaptation capabilities while performing different types of industrial tasks. Given some knowledge about the task requirements, humans are able to plan their limbs motion in anticipation of the execution of specific skills. For example, when an operator needs to drill a hole on a surface, the posture of her limbs varies to guarantee a stable configuration that is compatible with the drilling task specifications, e.g. exerting a force orthogonal to the surface. Therefore, we are interested in analyzing the human arms motion patterns in industrial activities. To do so, we build our analysis on the so-called manipulability ellipsoid, which captures a posture-dependent ability to perform motion and exert forces along different task directions. Through thorough analysis of the human movement manipulability, we found that the ellipsoid shape is task dependent and often provides more information about the human motion than classical manipulability indices. Moreover, we show how manipulability patterns can be transferred to robots by learning a probabilistic model and employing a manipulability tracking controller that acts on the task planning and execution according to predefined control hierarchies.
LGJul 1, 2020
Interaction-limited Inverse Reinforcement LearningMartin Troussard, Emmanuel Pignat, Parameswaran Kamalaruban et al.
This paper proposes an inverse reinforcement learning (IRL) framework to accelerate learning when the learner-teacher \textit{interaction} is \textit{limited} during training. Our setting is motivated by the realistic scenarios where a helpful teacher is not available or when the teacher cannot access the learning dynamics of the student. We present two different training strategies: Curriculum Inverse Reinforcement Learning (CIRL) covering the teacher's perspective, and Self-Paced Inverse Reinforcement Learning (SPIRL) focusing on the learner's perspective. Using experiments in simulations and experiments with a real robot learning a task from a human demonstrator, we show that our training strategies can allow a faster training than a random teacher for CIRL and than a batch learner for SPIRL.
ROMar 4, 2020
Learning,Generating and Adapting Wave Gestures for Expressive Human-Robot InteractionMihalis Panteris, Simon Manschitz, Sylvain Calinon
This study proposes a novel imitation learning approach for the stochastic generation of human-like rhythmic wave gestures and their modulation for effective non-verbal communication through a probabilistic formulation using joint angle data from human demonstrations. This is achieved by learning and modulating the overall expression characteristics of the gesture (e.g., arm posture, waving frequency and amplitude) in the frequency domain. The method was evaluated on simulated robot experiments involving a robot with a manipulator of 6 degrees of freedom. The results show that the method provides efficient encoding and modulation of rhythmic movements and ensures variability in their execution.
ROJan 31, 2020
A memory of motion for visual predictive control tasksAntonio Paolillo, Teguh Santoso Lembono, Sylvain Calinon
This paper addresses the problem of efficiently achieving visual predictive control tasks. To this end, a memory of motion, containing a set of trajectories built off-line, is used for leveraging precomputation and dealing with difficult visual tasks. Standard regression techniques, such as k-nearest neighbors and Gaussian process regression, are used to query the memory and provide on-line a warm-start and a way point to the control optimization process. The proposed technique allows the control scheme to achieve high performance and, at the same time, keep the computational time limited. Simulation and experimental results, carried out with a 7-axis manipulator, show the effectiveness of the approach.
ROJan 31, 2020
Learning How to Walk: Warm-starting Optimal Control Solver with Memory of MotionTeguh Santoso Lembono, Carlos Mastalli, Pierre Fernbach et al.
In this paper, we propose a framework to build a memory of motion for warm-starting an optimal control solver for the locomotion task of a humanoid robot. We use HPP Loco3D, a versatile locomotion planner, to generate offline a set of dynamically consistent whole-body trajectory to be stored as the memory of motion. The learning problem is formulated as a regression problem to predict a single-step motion given the desired contact locations, which is used as a building block for producing multi-step motions. The predicted motion is then used as a warm-start for the fast optimal control solver Crocoddyl. We have shown that the approach manages to reduce the required number of iterations to reach the convergence from $\sim$9.5 to only $\sim$3.0 iterations for the single-step motion and from $\sim$6.2 to $\sim$4.5 iterations for the multi-step motion, while maintaining the solution's quality.
ROOct 11, 2019
Learning from demonstration with model-based Gaussian processNoémie Jaquier, David Ginsbourger, Sylvain Calinon
In learning from demonstrations, it is often desirable to adapt the behavior of the robot as a function of the variability retrieved from human demonstrations and the (un)certainty encoded in different parts of the task. In this paper, we propose a novel multi-output Gaussian process (MOGP) based on Gaussian mixture regression (GMR). The proposed approach encapsulates the variability retrieved from the demonstrations in the covariance of the MOGP. Leveraging the generative nature of GP models, our approach can efficiently modulate trajectories towards new start-, via- or end-points defined by the task. Our framework allows the robot to precisely track via-points while being compliant in regions of high variability. We illustrate the proposed approach in simulated examples and validate it in a real-robot experiment.
ROOct 11, 2019
Bayesian Optimization Meets Riemannian Manifolds in Robot LearningNoémie Jaquier, Leonel Rozo, Sylvain Calinon et al.
Bayesian optimization (BO) recently became popular in robotics to optimize control parameters and parametric policies in direct reinforcement learning due to its data efficiency and gradient-free approach. However, its performance may be seriously compromised when the parameter space is high-dimensional. A way to tackle this problem is to introduce domain knowledge into the BO framework. We propose to exploit the geometry of non-Euclidean parameter spaces, which often arise in robotics (e.g. orientation, stiffness matrix). Our approach, built on Riemannian manifold theory, allows BO to properly measure similarities in the parameter space through geometry-aware kernel functions and to optimize the acquisition function on the manifold as an unconstrained problem. We test our approach in several benchmark artificial landscapes and using a 7-DOF simulated robot to learn orientation and impedance parameters for manipulation skills.
ROSep 12, 2019
Gaussians on Riemannian Manifolds: Applications for Robot Learning and Adaptive ControlSylvain Calinon
This article presents an overview of robot learning and adaptive control applications that can benefit from a joint use of Riemannian geometry and probabilistic representations. The roles of Riemannian manifolds, geodesics and parallel transport in robotics are first discussed. Several forms of manifolds already employed in robotics are then presented, by also listing manifolds that have been underexploited but that have potentials in future robot learning applications. A varied range of techniques employing Gaussian distributions on Riemannian manifolds is then introduced, including clustering, regression, information fusion, planning and control problems. Two examples of applications are presented, involving the control of a prosthetic hand from surface electromyography (sEMG) data, and the teleoperation of a bimanual underwater robot. Further perspectives are finally discussed, with suggestions of promising research directions.
ROJul 2, 2019
Memory of Motion for Warm-starting Trajectory OptimizationTeguh Santoso Lembono, Antonio Paolillo, Emmanuel Pignat et al.
Trajectory optimization for motion planning requires good initial guesses to obtain good performance. In our proposed approach, we build a memory of motion based on a database of robot paths to provide good initial guesses. The memory of motion relies on function approximators and dimensionality reduction techniques to learn the mapping between the tasks and the robot paths. Three function approximators are compared: $k$-Nearest Neighbor, Gaussian Process Regression, and Bayesian Gaussian Mixture Regression. In addition, we show that the memory can be used as a metric to choose between several possible goals, and using an ensemble method to combine different function approximators results in a significantly improved warm-starting performance. We demonstrate the proposed approach with motion planning examples on the dual-arm robot PR2 and the humanoid robot Atlas.
ROMay 23, 2019
Nullspace Structure in Model Predictive ControlHakan Girgin, Sylvain Calinon
Robotic tasks can be accomplished by exploiting different forms of redundancies. This work focuses on planning redundancy within Model Predictive Control (MPC) in which several paths can be considered within the MPC time horizon. We present the nullspace structure in MPC with a quadratic approximation of the cost and a linearization of the dynamics. We exploit the low rank structure of the precision matrices used in MPC (encapsulating spatiotemporal information) to perform hierarchical task planning, and show how nullspace computation can be treated as a fusion problem (computed with a product of Gaussian experts). We illustrate the approach using proof-of-concept examples with point mass objects and simulated robotics applications.
ROMay 23, 2019
Variational Inference with Mixture Model Approximation: Robotic ApplicationsEmmanuel Pignat, Teguh Lembono, Sylvain Calinon
We propose a method to approximate the distribution of robot configurations satisfying multiple objectives. Our approach uses variational inference, a popular method in Bayesian computation, which has several advantages over sampling-based techniques. To be able to represent the complex and multimodal distribution of configurations, we propose to use a mixture model as approximate distribution, an approach that has gained popularity recently. In this work, we show the interesting properties of this approach and how it can be applied to a wide range of problems in robotics.
ROApr 24, 2019
Bayesian Gaussian mixture model for robotic policy imitationEmmanuel Pignat, Sylvain Calinon
A common approach to learn robotic skills is to imitate a demonstrated policy. Due to the compounding of small errors and perturbations, this approach may let the robot leave the states in which the demonstrations were provided. This requires the consideration of additional strategies to guarantee that the robot will behave appropriately when facing unknown states. We propose to use a Bayesian method to quantify the action uncertainty at each state. The proposed Bayesian method is simple to set up, computationally efficient, and can adapt to a wide range of problems. Our approach exploits the estimated uncertainty to fuse the imitation policy with additional policies. It is validated on a Panda robot with the imitation of three manipulation tasks in the continuous domain using different control input/state pairs.
ROFeb 28, 2019
Tensor-variate Mixture of Experts for Proportional Myographic Control of a Robotic HandNoémie Jaquier, Robert Haschke, Sylvain Calinon
When data are organized in matrices or arrays of higher dimensions (tensors), classical regression methods first transform these data into vectors, therefore ignoring the underlying structure of the data and increasing the dimensionality of the problem. This flattening operation typically leads to overfitting when only few training data is available. In this paper, we present a mixture-of-experts model that exploits tensorial representations for regression of tensor-valued data. The proposed formulation takes into account the underlying structure of the data and remains efficient when few training data are available. Evaluation on artificially generated data, as well as offline and real-time experiments recognizing hand movements from tactile myography prove the effectiveness of the proposed approach.
ROFeb 19, 2019
Improving dual-arm assembly by master-slave complianceMarkku Suomalainen, Sylvain Calinon, Emmanuel Pignat et al.
In this paper we show how different choices regarding compliance affect a dual-arm assembly task. In addition, we present how the compliance parameters can be learned from a human demonstration. Compliant motions can be used in assembly tasks to mitigate pose errors originating from, for example, inaccurate grasping. We present analytical background and accompanying experimental results on how to choose the center of compliance to enhance the convergence region of an alignment task. Then we present the possible ways of choosing the compliant axes for accomplishing alignment in a scenario where orientation error is present. We show that an earlier presented Learning from Demonstration method can be used to learn motion and compliance parameters of an impedance controller for both manipulators. The learning requires a human demonstration with a single teleoperated manipulator only, easing the execution of demonstration and enabling usage of manipulators at difficult locations as well. Finally, we experimentally verify our claim that having both manipulators compliant in both rotation and translation can accomplish the alignment task with less total joint motions and in shorter time than moving one manipulator only. In addition, we show that the learning method produces the parameters that achieve the best results in our experiments.
RONov 27, 2018
Geometry-aware Manipulability Learning, Tracking and TransferNoémie Jaquier, Leonel Rozo, Darwin G. Caldwell et al.
Body posture influences human and robots performance in manipulation tasks, as appropriate poses facilitate motion or force exertion along different axes. In robotics, manipulability ellipsoids arise as a powerful descriptor to analyze, control and design the robot dexterity as a function of the articulatory joint configuration. This descriptor can be designed according to different task requirements, such as tracking a desired position or apply a specific force. In this context, this paper presents a novel \emph{manipulability transfer} framework, a method that allows robots to learn and reproduce manipulability ellipsoids from expert demonstrations. The proposed learning scheme is built on a tensor-based formulation of a Gaussian mixture model that takes into account that manipulability ellipsoids lie on the manifold of symmetric positive definite matrices. Learning is coupled with a geometry-aware tracking controller allowing robots to follow a desired profile of manipulability ellipsoids. Extensive evaluations in simulation with redundant manipulators, a robotic hand and humanoids agents, as well as an experiment with two real dual-arm systems validate the feasibility of the approach.
RONov 19, 2018
Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov ModelsAjay Kumar Tanwani, Jonathan Lee, Brijen Thananjeyan et al.
Generalizing manipulation skills to new situations requires extracting invariant patterns from demonstrations. For example, the robot needs to understand the demonstrations at a higher level while being invariant to the appearance of the objects, geometric aspects of objects such as its position, size, orientation and viewpoint of the observer in the demonstrations. In this paper, we propose an algorithm that learns a joint probability density function of the demonstrations with invariant formulations of hidden semi-Markov models to extract invariant segments (also termed as sub-goals or options), and smoothly follow the generated sequence of states with a linear quadratic tracking controller. The algorithm takes as input the demonstrations with respect to different coordinate systems describing virtual landmarks or objects of interest with a task-parameterized formulation, and adapt the segments according to the environmental changes in a systematic manner. We present variants of this algorithm in latent space with low-rank covariance decompositions, semi-tied covariances, and non-parametric online estimation of model parameters under small variance asymptotics; yielding considerably low sample and model complexity for acquiring new manipulation skills. The algorithm allows a Baxter robot to learn a pick-and-place task while avoiding a movable obstacle based on only 4 kinesthetic demonstrations.
ROJul 6, 2018
A survey on policy search algorithms for learning robot controllers in a handful of trialsKonstantinos Chatzilygeroudis, Vassilis Vassiliades, Freek Stulp et al.
Most policy search algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the word "big-data", we refer to this challenge as "micro-data reinforcement learning". We show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to create data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or the dynamical model (e.g., model-based policy search), so that the policy optimizer queries the model instead of the real system. Overall, all successful micro-data algorithms combine these two strategies by varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots (e.g., humanoids), designing generic priors, and optimizing the computing time.
RODec 19, 2017
Probabilistic Learning of Torque Controllers from Kinematic and Force ConstraintsJoão Silvério, Yanlong Huang, Leonel Rozo et al.
When learning skills from demonstrations, one is often required to think in advance about the appropriate task representation (usually in either operational or configuration space). We here propose a probabilistic approach for simultaneously learning and synthesizing torque control commands which take into account task space, joint space and force constraints. We treat the problem by considering different torque controllers acting on the robot, whose relevance is learned probabilistically from demonstrations. This information is used to combine the controllers by exploiting the properties of Gaussian distributions, generating new torque commands that satisfy the important features of the task. We validate the approach in two experimental scenarios using 7-DoF torquecontrolled manipulators, with tasks that require the consideration of different controllers to be properly executed.
ROJul 21, 2017
Learning Task Priorities from DemonstrationsJoão Silvério, Sylvain Calinon, Leonel Rozo et al.
Bimanual operations in humanoids offer the possibility to carry out more than one manipulation task at the same time, which in turn introduces the problem of task prioritization. We address this problem from a learning from demonstration perspective, by extending the Task-Parameterized Gaussian Mixture Model (TP-GMM) to Jacobian and null space structures. The proposed approach is tested on bimanual skills but can be applied in any scenario where the prioritization between potentially conflicting tasks needs to be learned. We evaluate the proposed framework in: two different tasks with humanoids requiring the learning of priorities and a loco-manipulation scenario, showing that the approach can be exploited to learn the prioritization of multiple tasks in parallel.
ROOct 8, 2016
Small Variance Asymptotics for Non-Parametric Online Robot LearningAjay Kumar Tanwani, Sylvain Calinon
Small variance asymptotics is emerging as a useful technique for inference in large scale Bayesian non-parametric mixture models. This paper analyses the online learning of robot manipulation tasks with Bayesian non-parametric mixture models under small variance asymptotics. The analysis yields a scalable online sequence clustering (SOSC) algorithm that is non-parametric in the number of clusters and the subspace dimension of each cluster. SOSC groups the new datapoint in its low dimensional subspace by online inference in a non-parametric mixture of probabilistic principal component analyzers (MPPCA) based on Dirichlet process, and captures the state transition and state duration information online in a hidden semi-Markov model (HSMM) based on hierarchical Dirichlet process. A task-parameterized formulation of our approach autonomously adapts the model to changing environmental situations during manipulation. We apply the algorithm in a teleoperation setting to recognize the intention of the operator and remotely adjust the movement of the robot using the learned model. The generative model is used to synthesize both time-independent and time-dependent behaviours by relying on the principles of shared and autonomous control. Experiments with the Baxter robot yield parsimonious clusters that adapt online with new demonstrations and assist the operator in performing remote manipulation tasks.