Olivier Stasse

RO
h-index: 45
6 papers
115 citations
Novelty: 57%
AI Score: 42

6 Papers

56.5 RO May 22
Direct Dynamic Retargeting for Humanoid Imitation Learning from Videos

Constant Roux, Ludovic De Matteïs, Armand Jordana et al.

Imitation Learning from monocular video demonstrations provides a scalable approach for teaching complex skills to humanoid robots. However, translating human motion to humanoids requires overcoming significant morphological mismatches. Standard approaches rely on Geometric Retargeting or Indirect Dynamic Retargeting pipelines. We identify that these intermediate kinematic projections introduce a geometric bias, restricting the search space and yielding suboptimal dynamic behaviors. In this paper, we propose Direct Dynamic Retargeting (DDR), a novel single-stage framework that generates high-fidelity, dynamically feasible trajectories directly from expert videos. By formulating the problem in the task space and leveraging a sampling-based Model Predictive Control solver within a physics simulator, DDR natively optimizes over complex contact sequences while mitigating input drift. Our experiments demonstrate that bypassing the geometric bias allows DDR to outperform state-of-the-art baselines in demonstration tracking accuracy. Furthermore, we establish that providing such physically viable references to RL agents accelerates training convergence and enhances the final execution of agile and balancing behaviors. Source code will be made publicly available.
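The sampling-based Model Predictive Control idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the DDR solver: it assumes a toy 1D point mass in place of a physics simulator, a squared tracking error as the cost, and Gaussian perturbations of a nominal control sequence. The names `rollout_cost` and `sample_mpc_step` are hypothetical.

```python
import random

def rollout_cost(x, v, controls, reference, dt=0.05):
    """Simulate a 1D point mass (assumed toy model) and sum squared tracking errors."""
    cost = 0.0
    for u, x_ref in zip(controls, reference):
        v += u * dt          # acceleration command integrates into velocity
        x += v * dt          # velocity integrates into position
        cost += (x - x_ref) ** 2
    return cost

def sample_mpc_step(x, v, nominal, reference, n_samples=64, sigma=1.0, rng=None):
    """Perturb the nominal plan with Gaussian noise and keep the cheapest candidate."""
    rng = rng or random.Random(0)
    best = list(nominal)
    best_cost = rollout_cost(x, v, best, reference)
    for _ in range(n_samples):
        candidate = [u + rng.gauss(0.0, sigma) for u in nominal]
        cost = rollout_cost(x, v, candidate, reference)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost

# Track a slowly advancing reference position over a 10-step horizon.
horizon = 10
nominal = [0.0] * horizon
reference = [0.1 * (k + 1) for k in range(horizon)]
plan, cost = sample_mpc_step(0.0, 0.0, nominal, reference)
```

Because the nominal plan is itself evaluated, the returned plan is never worse than it. In a receding-horizon loop, only the first control of `plan` would be applied before re-planning from the new state.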

RO Mar 27, 2024
CaT: Constraints as Terminations for Legged Locomotion Reinforcement Learning

Elliot Chane-Sane, Pierre-Alexandre Leziart, Thomas Flayols et al.

Deep Reinforcement Learning (RL) has demonstrated impressive results in solving complex robotic tasks such as quadruped locomotion. Yet, current solvers fail to produce efficient policies respecting hard constraints. In this work, we advocate for integrating constraints into robot learning and present Constraints as Terminations (CaT), a novel constrained RL algorithm. Departing from classical constrained RL formulations, we reformulate constraints through stochastic terminations during policy learning: any violation of a constraint triggers a probability of terminating potential future rewards the RL agent could attain. We propose an algorithmic approach to this formulation, by minimally modifying widely used off-the-shelf RL algorithms in robot learning (such as Proximal Policy Optimization). Our approach leads to excellent constraint adherence without introducing undue complexity and computational overhead, thus mitigating barriers to broader adoption. Through empirical evaluation on the real quadruped robot Solo crossing challenging obstacles, we demonstrate that CaT provides a compelling solution for incorporating constraints into RL frameworks. Videos and code are available at https://constraints-as-terminations.github.io.
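To make the idea of stochastic terminations concrete, here is a minimal sketch of how a constraint violation could discount future rewards in expectation. The abstract does not give the mapping from violation to termination probability, so the linear, clipped mapping below and the parameters `max_violation` and `p_max` are assumptions, as are the function names.

```python
def termination_probability(violation, max_violation=1.0, p_max=1.0):
    """Assumed mapping: no violation gives 0, larger violations scale linearly up to p_max."""
    if violation <= 0.0:
        return 0.0
    return p_max * min(violation / max_violation, 1.0)

def expected_return(rewards, violations, gamma=0.99, max_violation=1.0, p_max=1.0):
    """Expected discounted return when each step may terminate all future rewards."""
    total = 0.0
    survive = 1.0  # probability the episode is still running at step t
    for t, (r, v) in enumerate(zip(rewards, violations)):
        total += survive * (gamma ** t) * r
        survive *= 1.0 - termination_probability(v, max_violation, p_max)
    return total

# A half-strength violation at step 1 halves the expected reward of step 2.
no_violation = expected_return([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0)
with_violation = expected_return([1.0, 1.0, 1.0], [0.0, 0.5, 0.0], gamma=1.0)
```

In this sketch a violation never changes the reward of the step on which it occurs; it only lowers the probability of collecting rewards afterwards, which is the behavior the abstract describes.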

RO Dec 5, 2024
Reinforcement Learning from Wild Animal Videos

Elliot Chane-Sane, Constant Roux, Olivier Stasse et al.

We propose to learn legged robot locomotion skills by watching thousands of wild animal videos from the internet, such as those featured in nature documentaries. Indeed, such videos offer a rich and diverse collection of plausible motion examples, which could inform how robots should move. To achieve this, we introduce Reinforcement Learning from Wild Animal Videos (RLWAV), a method to ground these motions in physical robots. We first train a video classifier on a large-scale animal video dataset to recognize actions from RGB clips of animals in their natural habitats. We then train a multi-skill policy to control a robot in a physics simulator, using the classification score of a third-person camera capturing videos of the robot's movements as a reward for reinforcement learning. Finally, we directly transfer the learned policy to a real quadruped Solo. Remarkably, despite the extreme gap in both domain and embodiment between animals in the wild and robots, our approach enables the policy to learn diverse skills such as walking, jumping, and keeping still, without relying on reference trajectories or skill-specific rewards.
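The use of a classification score as a reward can be sketched as follows. The abstract does not specify how the score is computed, so this example assumes the classifier outputs one logit per skill and that the reward is the softmax probability of the commanded skill. The skill list is taken from the abstract; the logits and function names are hypothetical.

```python
import math

SKILLS = ["walking", "jumping", "keeping still"]

def softmax(logits):
    """Numerically stable softmax over a list of classifier logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def skill_reward(logits, commanded_skill):
    """Assumed reward: classifier probability assigned to the commanded skill."""
    probs = softmax(logits)
    return probs[SKILLS.index(commanded_skill)]

# Hypothetical logits for a clip of the simulated robot that looks like walking.
clip_logits = [5.0, 0.0, 0.0]
r_walk = skill_reward(clip_logits, "walking")
r_jump = skill_reward(clip_logits, "jumping")
```

With these logits the policy is rewarded more when commanded to walk than when commanded to jump, so the same clip yields a different reward for each commanded skill.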

RO Jan 26, 2021
Design, analysis and control of the series-parallel hybrid RH5 humanoid robot

Julian Esser, Shivesh Kumar, Heiner Peters et al.

The last decades of humanoid research have shown that humanoids developed for high dynamic performance require a stiff structure and an optimal distribution of mass-inertial properties. Humanoid robots built with a purely tree-type architecture tend to be bulky and usually suffer from velocity and force/torque limitations. This paper presents a novel series-parallel hybrid humanoid called RH5, which is 2 m tall, weighs only 62.5 kg, and is capable of performing heavy-duty dynamic tasks with 5 kg payloads in each hand. The analysis and control of this humanoid are performed with a whole-body trajectory optimization technique based on differential dynamic programming (DDP). Additionally, we present an improved contact-stability soft-constrained DDP algorithm that generates physically consistent walking trajectories for the humanoid, which can be tracked via simple PD position control in a physics simulator. Finally, we showcase preliminary experimental results on the RH5 humanoid robot.

RO Sep 14, 2018
Motion Planning in Irreducible Path Spaces

Andreas Orthey, Olivier Roussel, Olivier Stasse et al.

The motion of a mechanical system can be defined as a path through its configuration space. Computing such a path has a computational complexity that scales exponentially with the dimensionality of the configuration space. We propose to reduce the dimensionality of the configuration space by introducing the irreducible path, a path having a minimal swept volume. The paper consists of three parts. In part I, we define the space of all irreducible paths and show that planning a path in the irreducible path space preserves completeness of any motion planning algorithm. In part II, we construct an approximation to the irreducible path space of a serial kinematic chain under certain assumptions. In part III, we conduct motion planning using the irreducible path space for a mechanical snake in a turbine environment, for a mechanical octopus with eight arms in a pipe system, and for the sideways motion of a humanoid robot moving through a room with doors and through a hole in a wall. We demonstrate that the concept of an irreducible path can be applied to any motion planning algorithm taking curvature constraints into account.

RO Sep 26, 2016
How do walkers avoid a mobile robot crossing their way?

Christian Vassallo, Anne-Hélène Olivier, Philippe Souères et al.

Robots and humans have to share the same environment more and more often. To steer robots safely and conveniently among humans, it is necessary to understand how humans interact with them. This work focuses on collision avoidance between a human and a robot during locomotion. Building on previous results on human obstacle avoidance and on the main principles that guide collision avoidance strategies, we observe how humans adapt a goal-directed locomotion task when they have to interfere with a mobile robot. Our results show differences between the strategy humans use to avoid a robot and the one they use to avoid another human. Humans prefer to give way to the robot even when they are likely to pass first at the beginning of the interaction.