RO Oct 14, 2023
Benchmarking the Sim-to-Real Gap in Cloth Manipulation
David Blanco-Mulero, Oriol Barbany, Gokhan Alcan et al.
Realistic physics engines play a crucial role in learning to manipulate deformable objects such as garments in simulation. By doing so, researchers can circumvent challenges such as sensing the deformation of the object in the real world. In spite of the extensive use of simulations for this task, few works have evaluated the reality gap between deformable object simulators and real-world data. We present a benchmark dataset to evaluate the sim-to-real gap in cloth manipulation. The dataset is collected by performing a dynamic as well as a quasi-static cloth manipulation task involving contact with a rigid table. We use the dataset to evaluate the reality gap, computational time, and simulation stability of four popular deformable object simulators: MuJoCo, Bullet, Flex, and SOFA. Additionally, we discuss the benefits and drawbacks of each simulator. The benchmark dataset is open-source. Supplementary material, videos, and code can be found at https://sites.google.com/view/cloth-sim2real-benchmark.
RO Mar 23, 2023
QDP: Learning to Sequentially Optimise Quasi-Static and Dynamic Manipulation Primitives for Robotic Cloth Manipulation
David Blanco-Mulero, Gokhan Alcan, Fares J. Abu-Dakka et al.
Pre-defined manipulation primitives are widely used for cloth manipulation. However, cloth properties such as stiffness or density can strongly affect the performance of these primitives. Although existing solutions have tackled the parameterisation of pick and place locations, the effect of factors such as the velocity or trajectory of quasi-static and dynamic manipulation primitives has been neglected. Choosing appropriate values for these parameters is crucial to cope with the range of materials present in household cloth objects. To address this challenge, we introduce the Quasi-Dynamic Parameterisable (QDP) method, which optimises parameters such as the motion velocity in addition to the pick and place positions of quasi-static and dynamic manipulation primitives. In this work, we leverage the framework of Sequential Reinforcement Learning to sequentially decouple the parameters that compose the primitives. To evaluate the effectiveness of the method, we focus on the task of cloth unfolding with a robotic arm in simulation and real-world experiments. Our results in simulation show that, by deciding the optimal parameters for the primitives, the performance can improve by 20% compared to sub-optimal ones. Real-world results demonstrate the advantage of modifying the velocity and height of manipulation primitives for cloths with different mass, stiffness, shape and size. Supplementary material, videos, and code can be found at https://sites.google.com/view/qdp-srl.
11.1 SY Apr 21
Adaptive Modular Geometric Control of Robotic Manipulators
Mahdi Hejrati, Amir Hossein Barjini, Gokhan Alcan et al.
This paper proposes an adaptive modular geometric control framework for robotic manipulators. The proposed methodology decomposes the overall manipulator dynamics into individual modules, enabling the design of local geometric control laws at the module level. To address parametric uncertainties, a geometric adaptation law is incorporated into the control structure, requiring only a single adaptation gain for the entire system while ensuring physically consistent and drift-free parameter estimates. Exponential stability of the proposed controller is established in the nominal case. Numerical simulations on a complex redundant robotic manipulator are conducted to evaluate the proposed approach against existing modular and geometric control methods. The results show that the proposed method reduces the RMS position error by at least 12.2% compared with state-of-the-art controllers under almost the same control effort. In addition, the adaptive extension demonstrates strong capability in compensating for parametric uncertainties and preserving high tracking performance.
LG Mar 22, 2024
Automated Feature Selection for Inverse Reinforcement Learning
Daulet Baimukashev, Gokhan Alcan, Ville Kyrki
Inverse reinforcement learning (IRL) is an imitation learning approach to learning reward functions from expert demonstrations. Its use avoids the difficult and tedious procedure of manual reward specification while retaining the generalization power of reinforcement learning. In IRL, the reward is usually represented as a linear combination of features. In continuous state spaces, the state variables alone are not sufficiently rich to be used as features, but which features are good is not known in general. To address this issue, we propose a method that employs polynomial basis functions to form a candidate set of features, which are shown to allow the matching of statistical moments of state distributions. Feature selection is then performed for the candidates by leveraging the correlation between trajectory probabilities and feature expectations. We demonstrate the approach's effectiveness by recovering reward functions that capture expert policies across non-linear control tasks of increasing complexity. Code, data, and videos are available at https://sites.google.com/view/feature4irl.
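To illustrate the candidate-feature step described in the abstract above, the following is a minimal sketch that enumerates all monomials of the state variables up to a chosen total degree. The expectation of each monomial over visited states is a raw moment of the state distribution, which is why such features relate to moment matching. The function name `polynomial_features`, the degree, and the example state are assumptions for illustration and not the authors' implementation; the correlation-based selection step is not shown.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(state, degree=2):
    """Return all monomials of the state variables with total degree 1..degree.

    Hypothetical helper: the ordering and the absence of a constant term
    are choices made for this sketch only.
    """
    feats = []
    for d in range(1, degree + 1):
        # Each index tuple (i, j, ...) corresponds to the monomial x_i * x_j * ...
        for idx in combinations_with_replacement(range(len(state)), d):
            feats.append(prod(state[i] for i in idx))
    return feats


# Two state variables, degree 2: x0, x1, x0^2, x0*x1, x1^2
phi = polynomial_features([2.0, 3.0], degree=2)
print(phi)  # [2.0, 3.0, 4.0, 6.0, 9.0]
```

A linear reward would then take the form of a weighted sum over a selected subset of these candidates.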
LG Jan 9, 2024
The Role of Higher-Order Cognitive Models in Active Learning
Oskar Keurulainen, Gokhan Alcan, Ville Kyrki
Building machines capable of efficiently collaborating with humans has been a longstanding goal in artificial intelligence. Especially in the presence of uncertainties, optimal cooperation often requires that humans and artificial agents model each other's behavior and use these models to infer underlying goals, beliefs or intentions, potentially involving multiple levels of recursion. Empirical evidence for such higher-order cognition in human behavior is also provided by previous works in cognitive science, linguistics, and robotics. We advocate for a new paradigm for active learning for human feedback that utilises humans as active data sources while accounting for their higher levels of agency. In particular, we discuss how an increasing level of agency results in qualitatively different forms of rational communication between an active learning system and a teacher. Additionally, we provide a practical example of active learning using a higher-order cognitive model. This is accompanied by a computational study that underscores the unique behaviors that this model produces.
RO Sep 10, 2021
Learning Visual Feedback Control for Dynamic Cloth Folding
Julius Hietala, David Blanco-Mulero, Gokhan Alcan et al.
Robotic manipulation of cloth is a challenging task due to the high dimensionality of the configuration space and the complexity of dynamics affected by various material properties. The effect of complex dynamics is even more pronounced in dynamic folding, for example, when a square piece of fabric is folded in two by a single manipulator. To account for the complexity and uncertainties, feedback of the cloth state using, e.g., vision is typically needed. However, construction of visual feedback policies for dynamic cloth folding is an open problem. In this paper, we present a solution that learns policies in simulation using Reinforcement Learning (RL) and transfers the learned policies directly to the real world. In addition, to learn a single policy that manipulates multiple materials, we randomize the material properties in simulation. We evaluate the contributions of visual feedback and material randomization in real-world experiments. The experimental results demonstrate that the proposed solution can successfully fold different fabric types using dynamic manipulation in the real world. Code, data, and videos are available at https://sites.google.com/view/dynamic-cloth-folding.
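The material randomization mentioned in the abstract above can be pictured as drawing a fresh set of cloth parameters at the start of every training episode. The sketch below is a generic illustration of that idea only: the parameter names, the ranges, and the use of uniform sampling are all assumptions, and the actual parameters depend on the simulator and on the authors' setup.

```python
import random

# Hypothetical parameter ranges (name -> (low, high)); real names and
# bounds depend on the simulator and are not taken from the paper.
MATERIAL_RANGES = {
    "stiffness": (0.1, 1.0),
    "damping": (0.01, 0.1),
    "mass": (0.05, 0.3),
}


def sample_material(rng):
    """Draw one set of material parameters uniformly from the ranges above."""
    return {
        name: rng.uniform(low, high)
        for name, (low, high) in MATERIAL_RANGES.items()
    }


# One independent draw per training episode, seeded for reproducibility.
rng = random.Random(0)
episode_materials = [sample_material(rng) for _ in range(3)]
for material in episode_materials:
    print(material)
```

Training a single policy across such draws is what lets it cope with fabrics whose properties are not known in advance.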
RO Mar 31, 2021
Planning for Safe Abortable Overtaking Maneuvers in Autonomous Driving
Jiyo Palatti, Andrei Aksjonov, Gokhan Alcan et al.
Overtaking is one of the most challenging tasks in driving, and the current solutions to autonomous overtaking are limited to simple and static scenarios. In this paper, we present a method for behavior and trajectory planning for safe autonomous overtaking. The proposed method optimizes the trajectory by simultaneously enforcing safety and minimizing intrusion onto the adjacent lane. Furthermore, the method allows the overtaking to be aborted, enabling the autonomous vehicle to merge back into the lane if safety is compromised, for example by traffic appearing in the opposing direction during maneuver execution. A finite state machine is used to select an appropriate maneuver at each time, and a combination of safe and reachable sets is used to iteratively generate intermediate reference targets based on the current maneuver. A nonlinear model predictive controller then plans dynamically feasible and collision-free trajectories to these intermediate reference targets. Simulation experiments demonstrate that the combination of intermediate reference generation and model predictive control is able to handle multiple behaviors, including following a lead vehicle, overtaking and aborting the overtake, within a single framework.
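The maneuver-selection layer described in the abstract above can be pictured as a small finite state machine over the three behaviors it names (following, overtaking, aborting). The sketch below is illustrative only: the state names, the boolean inputs `overtake_safe`, `passed_lead`, and `merged_back`, and the transition rules are assumptions, and the safe and reachable set computations and the model predictive controller are not shown.

```python
from enum import Enum, auto


class Maneuver(Enum):
    FOLLOW = auto()
    OVERTAKE = auto()
    ABORT = auto()


def next_maneuver(current, overtake_safe, passed_lead, merged_back):
    """Hypothetical transition rule selecting the maneuver for the next step."""
    if current is Maneuver.FOLLOW:
        # Start overtaking only when it is judged safe.
        return Maneuver.OVERTAKE if overtake_safe else Maneuver.FOLLOW
    if current is Maneuver.OVERTAKE:
        # Abort as soon as safety is compromised, e.g. by oncoming traffic.
        if not overtake_safe:
            return Maneuver.ABORT
        return Maneuver.FOLLOW if passed_lead else Maneuver.OVERTAKE
    # ABORT: keep merging back until the vehicle is in its own lane again.
    return Maneuver.FOLLOW if merged_back else Maneuver.ABORT


state = Maneuver.FOLLOW
state = next_maneuver(state, overtake_safe=True, passed_lead=False, merged_back=False)
state = next_maneuver(state, overtake_safe=False, passed_lead=False, merged_back=False)
print(state)  # Maneuver.ABORT
```

In a full planner, the selected maneuver would determine which intermediate reference target is handed to the trajectory optimizer.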
RO Nov 2, 2020
Differential Dynamic Programming with Nonlinear Safety Constraints Under System Uncertainties
Gokhan Alcan, Ville Kyrki
Safe operation of systems such as robots requires them to plan and execute trajectories subject to safety constraints. When those systems are subject to uncertainties in their dynamics, it is challenging to ensure that the constraints are not violated. In this paper, we propose Safe-CDDP, a safe trajectory optimization and control approach for systems under additive uncertainties and non-linear safety constraints based on constrained differential dynamic programming (DDP). The safety of the robot during its motion is formulated as chance constraints with user-chosen probabilities of constraint satisfaction. The chance constraints are transformed into deterministic ones in the DDP formulation by constraint tightening. To avoid over-conservatism during constraint tightening, linear control gains of the feedback policy derived from the constrained DDP are used in the approximation of closed-loop uncertainty propagation in prediction. The proposed algorithm is empirically evaluated on three different robot dynamics with up to 12 degrees of freedom in simulation. The computational feasibility and applicability of the approach are demonstrated with a physical hardware implementation.
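As a simplified illustration of turning a chance constraint into a deterministic one by tightening, consider the textbook case of a linear constraint a·x ≤ b with a Gaussian state x ~ N(mu, cov). Requiring P(a·x ≤ b) ≥ p is then equivalent to a·mu ≤ b − z_p·sqrt(aᵀ cov a), where z_p is the standard normal quantile. The Gaussian and linear assumptions, and the function name `tightened_bound`, are made for this sketch only; the paper treats non-linear constraints and propagates the uncertainty in closed loop, which is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def tightened_bound(a, b, cov, p):
    """Return b' such that a.mu <= b' implies P(a.x <= b) >= p.

    Assumes x ~ N(mu, cov) and a linear constraint (illustrative only).
    """
    n = len(a)
    # Variance of the scalar a.x is the quadratic form a^T cov a.
    var = sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))
    z = NormalDist().inv_cdf(p)  # standard normal quantile for probability p
    return b - z * sqrt(var)


cov = [[0.04, 0.0], [0.0, 0.01]]
bound = tightened_bound([1.0, 0.0], 1.0, cov, p=0.95)
print(round(bound, 3))  # about 0.671: the mean must stay further from the limit
```

A larger predicted covariance shrinks the admissible region for the mean, which is why a less conservative estimate of the closed-loop covariance directly reduces over-conservatism.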