CL | Mar 14
Knowledge Distillation for Large Language Models
Alejandro Paredes La Torre, Barbara Flores, Diego Rodriguez
We propose a resource-efficient framework for compressing large language models through knowledge distillation, combined with guided chain-of-thought reinforcement learning. Using Qwen 3B as the teacher and Qwen 0.5B as the student, we apply knowledge distillation across the English Dolly-15k, Spanish Dolly-15k, and BugNet and PyTorrent code datasets, with hyperparameters tuned in the English setting to optimize student performance. Across tasks, the distilled student retains a substantial portion of the teacher's capability while remaining significantly smaller: 70% to 91% in English, up to 95% in Spanish, and up to 93.5% ROUGE-L in code. For coding tasks, integrating chain-of-thought prompting with Group Relative Policy Optimization using CoT-annotated Codeforces data improves reasoning coherence and solution correctness compared to knowledge distillation alone. Post-training 4-bit weight quantization further reduces memory footprint and inference latency. These results show that knowledge distillation combined with chain-of-thought guided reinforcement learning can produce compact, efficient models suitable for deployment in resource-constrained settings.
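The abstract does not state the distillation objective. As a minimal sketch, assuming the common temperature-scaled soft-target loss (KL divergence between teacher and student token distributions), the per-token term could look like the following. The function names and the default temperature are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: the standard soft-target formulation, scaled by T^2 so that
    gradient magnitudes stay comparable across temperatures. The paper's
    exact objective may differ.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(p_teacher, p_student)
        if p > 0.0
    )
    return kl * temperature ** 2


# Toy logits over a 3-token vocabulary (made-up numbers).
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.5]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.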
RO | Oct 19, 2018
NimbRo-OP2X: Adult-sized Open-source 3D Printed Humanoid Robot
Grzegorz Ficht, Hafez Farazi, André Brandenburger et al.
Humanoid robotics research depends on capable robot platforms, but recently developed advanced platforms are often not available to other research groups, expensive, dangerous to operate, or closed-source. The lack of available platforms forces researchers to work with smaller robots, which have less strict dynamic constraints, or with simulations, which lack many real-world effects. We developed NimbRo-OP2X to address this need. At a height of 135 cm, our robot is large enough to interact in a human environment. Its low weight of only 19 kg makes the operation of the robot safe and easy, as no special operational equipment is necessary. Our robot is equipped with a fast onboard computer and a GPU to accelerate parallel computations. We extend our already open-source software with a deep-learning-based vision system and gait parameter optimisation. The NimbRo-OP2X was evaluated during RoboCup 2018 in Montréal, Canada, where it won all possible awards in the Humanoid AdultSize class.
RO | Aug 9, 2021
Mapless Humanoid Navigation Using Learned Latent Dynamics
Andre Brandenburger, Diego Rodriguez, Sven Behnke
In this paper, we propose a novel Deep Reinforcement Learning approach to address the mapless navigation problem, in which the locomotion actions of a humanoid robot are taken online based on the knowledge encoded in learned models. Planning happens by generating open-loop trajectories in a learned latent space that captures the dynamics of the environment. Our planner considers visual (RGB images) and non-visual observations (e.g., attitude estimations). This gives the agent awareness not only of the scenario, but also of its own state. In addition, we incorporate a termination likelihood predictor model as an auxiliary loss function of the control policy, which enables the agent to anticipate terminal states of success and failure. In this manner, the sample efficiency of the approach for episodic tasks is increased. Our model is evaluated on the NimbRo-OP2X humanoid robot, which navigates scenes efficiently while avoiding collisions, both in simulation and on the real hardware.
RO | Jun 1, 2021
DeepWalk: Omnidirectional Bipedal Gait by Deep Reinforcement Learning
Diego Rodriguez, Sven Behnke
Bipedal walking is one of the most difficult but exciting challenges in robotics. The difficulties arise from the complexity of high-dimensional dynamics, sensing and actuation limitations combined with real-time and computational constraints. Deep Reinforcement Learning (DRL) holds the promise to address these issues by fully exploiting the robot dynamics with minimal craftsmanship. In this paper, we propose a novel DRL approach that enables an agent to learn omnidirectional locomotion for humanoid (bipedal) robots. Notably, the locomotion behaviors are accomplished by a single control policy (a single neural network). We achieve this by introducing a new curriculum learning method that gradually increases the task difficulty by scheduling target velocities. In addition, our method does not require reference motions, which facilitates its application to robots with different kinematics and reduces the overall complexity. Finally, different strategies for sim-to-real transfer are presented which allow us to transfer the learned policy to a real humanoid robot.
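The abstract says only that the curriculum raises task difficulty by scheduling target velocities; it gives no schedule. A minimal sketch, assuming a linear widening of the commanded velocity range over training, is shown below. The schedule shape, the numeric bounds, and both function names are hypothetical illustrations, not the authors' method.

```python
import random


def velocity_limit(progress, v_start=0.1, v_max=0.6):
    """Largest commanded speed allowed at a given training progress in [0, 1].

    Assumption: a linear ramp from v_start to v_max. The bounds are
    placeholder values, not taken from the paper.
    """
    progress = min(max(progress, 0.0), 1.0)
    return v_start + progress * (v_max - v_start)


def sample_target_velocity(progress, rng):
    """Sample an omnidirectional command (forward, lateral, turning).

    Each component is drawn uniformly from the currently allowed range, so
    early episodes see slow targets and later episodes see the full range.
    """
    limit = velocity_limit(progress)
    return tuple(rng.uniform(-limit, limit) for _ in range(3))


rng = random.Random(0)
early_command = sample_target_velocity(0.0, rng)
late_command = sample_target_velocity(1.0, rng)
```

Sampling symmetric ranges for all three components keeps the policy exposed to every walking direction from the start, while only the magnitude grows with progress.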
RO | Oct 19, 2020
NimbRo-OP2X: Affordable Adult-sized 3D-printed Open-Source Humanoid Robot for Research
Grzegorz Ficht, Hafez Farazi, Diego Rodriguez et al.
For several years, high development and production costs of humanoid robots restricted researchers interested in working in the field. To overcome this problem, several research groups have opted to work with simulated or smaller robots, whose acquisition costs are significantly lower. However, due to scale differences and imperfect simulation replicability, results may not be directly reproducible on real, adult-sized robots. In this paper, we present the NimbRo-OP2X, a capable and affordable adult-sized humanoid platform aiming to significantly lower the entry barrier for humanoid robot research. With a height of 135 cm and weight of only 19 kg, the robot can interact in an unmodified, human environment without special safety equipment. Modularity in hardware and software allows this platform enough flexibility to operate in different scenarios and applications with minimal effort. The robot is equipped with an on-board computer with GPU, which enables the implementation of state-of-the-art approaches for object detection and human perception demanded by areas such as manipulation and human-robot interaction. Finally, the capabilities of the NimbRo-OP2X, especially in terms of locomotion stability and visual perception, are evaluated. This includes the performance at RoboCup 2018, where NimbRo-OP2X won all possible awards in the AdultSize class.
CV | Aug 17, 2020
Category-Level 3D Non-Rigid Registration from Single-View RGB Images
Diego Rodriguez, Florian Huber, Sven Behnke
In this paper, we propose a novel approach to solve the 3D non-rigid registration problem from RGB images using Convolutional Neural Networks (CNNs). Our objective is to find a deformation field (typically used for transferring knowledge between instances, e.g., grasping skills) that warps a given 3D canonical model into a novel instance observed by a single-view RGB image. This is done by training a CNN that infers a deformation field for the visible parts of the canonical model and by employing a learned shape (latent) space for inferring the deformations of the occluded parts. As a result of the registration, the observed model is reconstructed. Because our method does not need depth information, it can register objects that are typically hard to perceive with RGB-D sensors, e.g., those with transparent or shiny surfaces. Even without depth data, our approach outperforms the Coherent Point Drift (CPD) registration method for the evaluated object categories.
RO | Dec 16, 2019
RoboCup 2019 AdultSize Winner NimbRo: Deep Learning Perception, In-Walk Kick, Push Recovery, and Team Play Capabilities
Diego Rodriguez, Hafez Farazi, Grzegorz Ficht et al.
Individual and team capabilities are challenged every year by rule changes and the increasing performance of the soccer teams at RoboCup Humanoid League. For RoboCup 2019 in the AdultSize class, the number of players (2 vs. 2 games) and the field dimensions were increased, which demanded team coordination and robust visual perception and localization modules. In this paper, we present the latest developments that led team NimbRo to win the soccer tournament, drop-in games, technical challenges and the Best Humanoid Award of the RoboCup Humanoid League 2019 in Sydney. These developments include a deep learning vision system, in-walk kicks, step-based push-recovery, and team play strategies.
RO | Oct 1, 2019
Autonomous Bimanual Functional Regrasping of Novel Object Class Instances
Dmytro Pavlichenko, Diego Rodriguez, Christian Lenz et al.
In human-made scenarios, robots need to be able to fully operate objects in their surroundings, i.e., objects are required to be functionally grasped rather than only picked. This imposes very strict constraints on the object pose such that a direct grasp can be performed. Inspired by the anthropomorphic nature of humanoid robots, we propose an approach that first grasps an object with one hand, obtaining full control over its pose, and subsequently performs the functional grasp with the second hand. Thus, we develop a fully autonomous pipeline for dual-arm functional regrasping of novel familiar objects, i.e., objects never seen before that belong to a known object category, e.g., spray bottles. This process involves semantic segmentation, object pose estimation, non-rigid mesh registration, grasp sampling, handover pose generation and in-hand pose refinement. The latter is used to compensate for the unpredictable object movement during the first grasp. The approach is applied to a human-like upper body. To the best of the authors' knowledge, this is the first system that exhibits autonomous bimanual functional regrasping capabilities. We demonstrate that our system yields reliable success rates and can be applied on-line to real-world tasks using only one off-the-shelf RGB-D sensor.
RO | Sep 19, 2019
Flexible Disaster Response of Tomorrow -- Final Presentation and Evaluation of the CENTAURO System
Tobias Klamt, Diego Rodriguez, Lorenzo Baccelliere et al.
Mobile manipulation robots have high potential to support rescue forces in disaster-response missions. Despite the difficulties imposed by real-world scenarios, robots are promising for performing mission tasks from a safe distance. In the CENTAURO project, we developed a disaster-response system which consists of the highly flexible Centauro robot and suitable control interfaces including an immersive tele-presence suit and support-operator controls on different levels of autonomy. In this article, we give an overview of the final CENTAURO system. In particular, we explain several high-level design decisions and how those were derived from requirements and extensive experience of Kerntechnische Hilfsdienst GmbH, Karlsruhe, Germany (KHG). We focus on components which were recently integrated and report on a systematic evaluation which demonstrated system capabilities and revealed valuable insights.
RO | Sep 5, 2019
NimbRo Robots Winning RoboCup 2018 Humanoid AdultSize Soccer Competitions
Hafez Farazi, Grzegorz Ficht, Philipp Allgeuer et al.
Over the past few years, the Humanoid League rules have changed towards more realistic and challenging game environments, which encourage teams to advance their robot soccer performances. In this paper, we present the software and hardware designs that led our team NimbRo to win the competitions in the AdultSize league -- including the soccer tournament, the drop-in games, and the technical challenges at RoboCup 2018 in Montreal. Altogether, this resulted in NimbRo winning the Best Humanoid Award. In particular, we describe our deep-learning approaches for visual perception and our new fully 3D printed robot NimbRo-OP2X.
RO | Aug 5, 2019
Remote Mobile Manipulation with the Centauro Robot: Full-body Telepresence and Autonomous Operator Assistance
Tobias Klamt, Max Schwarz, Christian Lenz et al.
Solving mobile manipulation tasks in inaccessible and dangerous environments is an important application of robots to support humans. Example domains are construction and maintenance of manned and unmanned stations on the moon and other planets. Suitable platforms require flexible and robust hardware, a locomotion approach that allows for navigating a wide variety of terrains, dexterous manipulation capabilities, and respective user interfaces. We present the CENTAURO system, which has been designed for these requirements and consists of the Centauro robot and a set of advanced operator interfaces with complementary strengths, enabling the system to solve a wide range of realistic mobile manipulation tasks. The robot possesses a centaur-like body plan and is driven by torque-controlled compliant actuators. Four articulated legs ending in steerable wheels allow for omnidirectional driving as well as for making steps. An anthropomorphic upper body with two arms ending in five-finger hands enables human-like manipulation. The robot perceives its environment through a suite of multimodal sensors. The resulting platform complexity goes beyond the complexity of most known systems, which puts the focus on a suitable operator interface. An operator can control the robot through a telepresence suit, which allows for flexibly solving a large variety of mobile manipulation tasks. Locomotion and manipulation functionalities on different levels of autonomy support the operation. The proposed user interfaces enable solving a wide variety of tasks without previous task-specific training. The integrated system is evaluated in numerous teleoperated experiments that are described along with lessons learned.
RO | Nov 21, 2018
Autonomous Dual-Arm Manipulation of Familiar Objects
Dmytro Pavlichenko, Diego Rodriguez, Max Schwarz et al.
Autonomous dual-arm manipulation is an essential skill to deploy robots in unstructured scenarios. However, this is a challenging undertaking, particularly in terms of perception and planning. Unstructured scenarios are full of objects with different shapes and appearances that have to be grasped in a very specific manner so they can be functionally used. In this paper we present an integrated approach to perform dual-arm pick tasks autonomously. Our method consists of semantic segmentation, object pose estimation, deformable model registration, grasp planning and arm trajectory optimization. The entire pipeline can be executed on-board and is suitable for on-line grasping scenarios. For this, our approach makes use of accumulated knowledge expressed as convolutional neural network models and low-dimensional latent shape spaces. For manipulating objects, we propose a stochastic trajectory optimization that includes a kinematic chain closure constraint. Evaluation in simulation and on the real robot corroborates the feasibility and applicability of the proposed methods on a task of picking up unknown watering cans and drills using both arms.
RO | Oct 18, 2018
Learning Postural Synergies for Categorical Grasping through Shape Space Registration
Diego Rodriguez, Antonio Di Guardo, Antonio Frisoli et al.
Every time a person encounters an object with a given degree of familiarity, they immediately know how to grasp it. Adaptation of the movement of the hand according to the object geometry happens effortlessly because of the accumulated knowledge of previous experiences grasping similar objects. In this paper, we present a novel method for inferring grasp configurations based on the object shape. Grasping knowledge is gathered in a synergy space of the robotic hand built by following a human grasping taxonomy. The synergy space is constructed through human demonstrations employing an exoskeleton that provides force feedback, which makes it possible to evaluate the quality of the grasp. The shape descriptor is obtained by means of a categorical non-rigid registration that encodes typical intra-class variations. This approach is especially suitable for on-line scenarios where only a portion of the object's surface is observable. This method is demonstrated through simulation and real robot experiments by grasping objects never seen before by the robot.
RO | Oct 6, 2018
Team NimbRo at MBZIRC 2017: Autonomous Valve Stem Turning using a Wrench
Max Schwarz, David Droeschel, Christian Lenz et al.
The Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2017 has defined ambitious new benchmarks to advance the state-of-the-art in autonomous operation of ground-based and flying robots. In this article, we describe our winning entry to MBZIRC Challenge 2: the mobile manipulation robot Mario. It is capable of autonomously solving a valve manipulation task using a wrench tool that is detected, grasped, and finally employed to turn a valve stem. Mario's omnidirectional base allows both fast locomotion and precise close approach to the manipulation panel. We describe an efficient detector for medium-sized objects in 3D laser scans and apply it to detect the manipulation panel. An object detection architecture based on deep neural networks is used to find and select the correct tool from grayscale images. Parametrized motion primitives are adapted online to percepts of the tool and valve stem in order to turn the stem. We report in detail on our winning performance at the challenge and discuss lessons learned.
RO | Sep 18, 2018
Supervised Autonomous Locomotion and Manipulation for Disaster Response with a Centaur-like Robot
Tobias Klamt, Diego Rodriguez, Max Schwarz et al.
Mobile manipulation tasks are one of the key challenges in the field of search and rescue (SAR) robotics, requiring robots with flexible locomotion and manipulation abilities. Since the tasks are mostly unknown in advance, the robot has to adapt to a wide variety of terrains and workspaces during a mission. The centaur-like robot Centauro has a hybrid legged-wheeled base and an anthropomorphic upper body to carry out complex tasks in environments too dangerous for humans. Due to its high number of degrees of freedom, controlling the robot with direct teleoperation approaches is challenging and exhausting. Supervised autonomy approaches are promising for increasing the quality and speed of control while keeping the flexibility to solve unknown tasks. We developed a set of operator assistance functionalities with different levels of autonomy to control the robot for challenging locomotion and manipulation tasks. The integrated system was evaluated in disaster response scenarios and showed promising performance.
RO | Sep 14, 2018
Transferring Category-based Functional Grasping Skills by Latent Space Non-Rigid Registration
Diego Rodriguez, Sven Behnke
Objects within a category are often similar in their shape and usage. When we, as humans, want to grasp something, we transfer our knowledge from past experiences and adapt it to novel objects. In this paper, we propose a new approach for transferring grasping skills that accumulates grasping knowledge into a category-level canonical model. Grasping motions for novel instances of the category are inferred from geometric deformations between the observed instance and the canonical shape. Correspondences between the shapes are established by means of a non-rigid registration method that combines the Coherent Point Drift approach with subspace methods. By incorporating category-level information into the registration, we avoid unlikely shapes and focus on deformations actually observed within the category. Control poses for generating grasping motions are accumulated in the canonical model from grasping definitions of known objects. According to the estimated shape parameters of a novel instance, the control poses are transformed towards it. The category-level model makes our method particularly relevant for on-line grasping, where fully-observed objects are not easily available. This is demonstrated through experiments in which objects with occluded handles are successfully grasped.
RO | Sep 14, 2018
Combining Simulations and Real-robot Experiments for Bayesian Optimization of Bipedal Gait Stabilization
Diego Rodriguez, André Brandenburger, Sven Behnke
Walking controllers often require parametrization which must be tuned according to some cost function. To estimate these parameters, simulations can be performed, which are cheap but do not fully represent reality. Real-robot experiments, on the other hand, are more expensive and lead to hardware wear. In this paper, we propose an approach for combining simulations and real experiments to learn gait stabilization parameters. We use a Bayesian optimization method which selects the most informative points in parameter space to evaluate, based on the entropy of the cost function to be optimized. Experiments with the igus Humanoid Open Platform demonstrate the effectiveness of our approach.
RO | Sep 14, 2018
Advanced Soccer Skills and Team Play of RoboCup 2017 TeenSize Winner NimbRo
Diego Rodriguez, Hafez Farazi, Philipp Allgeuer et al.
In order to pursue the vision of the RoboCup Humanoid League of beating the soccer world champion by 2050, new rules and competitions are added or modified each year, fostering novel technological advances. In 2017, the number of players in the TeenSize class soccer games was increased to 3 vs. 3, which allowed for more team play strategies. Improvements in individual skills were also demanded through a set of technical challenges. This paper presents the latest individual skills and team play developments used in RoboCup 2017 that led our team NimbRo to win the 2017 TeenSize soccer tournament, the technical challenges, and the drop-in games.
RO | Sep 14, 2018
Transferring Grasping Skills to Novel Instances by Latent Space Non-Rigid Registration
Diego Rodriguez, Corbin Cogswell, Seongyong Koo et al.
Robots acting in open environments need to be able to handle novel objects. Based on the observation that objects within a category are often similar in their shapes and usage, we propose an approach for transferring grasping skills from known instances to novel instances of an object category. Correspondences between the instances are established by means of a non-rigid registration method that combines the Coherent Point Drift approach with subspace methods. The known object instances are modeled using a canonical shape and a transformation which deforms it to match the instance shape. The principal axes of variation of these deformations define a low-dimensional latent space. New instances can be generated through interpolation and extrapolation in this shape space. For inferring the shape parameters of an unknown instance, an energy function expressed in terms of the latent variables is minimized. Due to the class-level knowledge of the object, our method is able to complete novel shapes from partial views. Control poses for generating grasping motions are transferred efficiently to novel instances by the estimated non-rigid transformation.
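To make the latent shape space concrete, here is a minimal sketch assuming a linear model in which a new shape is the canonical shape plus a mean deformation plus a weighted sum of principal deformation axes. The flattened coordinate layout, the toy numbers, and the names `deform` and `interpolate` are illustrative assumptions; the abstract does not give the exact parameterization.

```python
def deform(canonical, mean_offset, axes, latent):
    """Generate a shape from latent coordinates.

    canonical, mean_offset: flattened point coordinates (equal-length lists).
    axes: one flattened deformation direction per latent dimension.
    latent: one coefficient per axis.
    Assumption: shape = canonical + mean_offset + sum_k latent[k] * axes[k].
    """
    shape = []
    for i, coord in enumerate(canonical):
        offset = mean_offset[i] + sum(z * axis[i] for z, axis in zip(latent, axes))
        shape.append(coord + offset)
    return shape


def interpolate(z_a, z_b, t):
    """Blend two latent vectors; t outside [0, 1] extrapolates."""
    return [(1 - t) * a + t * b for a, b in zip(z_a, z_b)]


# Toy example: two 2D points flattened to [x0, y0, x1, y1], two latent axes.
canonical = [0.0, 0.0, 1.0, 0.0]
mean_offset = [0.0, 0.0, 0.0, 0.0]
axes = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]

halfway = interpolate([0.0, 0.0], [1.0, 2.0], 0.5)
new_shape = deform(canonical, mean_offset, axes, halfway)
```

Inferring the parameters of an unseen instance would then amount to searching over `latent` for the shape that best matches the observation, which is the role of the energy minimization described in the abstract.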
RO | Sep 13, 2018
Grown-up NimbRo Robots Winning RoboCup 2017 Humanoid AdultSize Soccer Competitions
Grzegorz Ficht, Dmytro Pavlichenko, Philipp Allgeuer et al.
The ongoing evolution of the RoboCup Humanoid League led in 2017 to the introduction of one vs. one soccer games for the AdultSize robots, which motivated our team NimbRo to enter this category. In this paper, we present the mechatronic design of our upgraded robot Copedo and the newly developed NimbRo-OP2, which received the RoboCup Design Award. We also describe improved approaches to visual perception of the game situation, including compassless localization on a soccer field with symmetric appearance, and the generation of soccer behaviors. At RoboCup 2017 in Nagoya, our robots played very well, winning the AdultSize soccer tournament with high scores. Our robots also won the technical challenges, and we present the developed solutions.