Javier Ruiz-del-Solar

RO
h-index: 20
14 papers
182 citations
Novelty: 35%
AI Score: 43

14 Papers

RO Jun 3
Sem-NaVAE: Semantically-Guided Outdoor Mapless Navigation via Generative Trajectory Priors

Gonzalo Olguín, Javier Ruiz-del-Solar

This work presents a mapless navigation approach for outdoor applications. It combines the exploratory capacity of conditional variational autoencoders (CVAEs) to generate trajectories with the semantic segmentation capabilities of a lightweight visual language model (VLM) to select the trajectory to execute. Open-vocabulary segmentation is used to score and select the generated trajectories based on natural language, and a state-of-the-art local planner executes velocity commands. A key feature of the proposed approach is its ability to generate a wide variety of trajectories and select among them to navigate in real time. In real-world outdoor experiments, Sem-NaVAE achieves a 90% success rate across routes of 120-240 m in unseen environments, outperforming the nearest baseline by 10% while remaining within 7% of a map-based upper bound. A video showing an experimental run of the system can be found at https://youtu.be/i3R5ey5O2yk.
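
The score-and-select step described above can be illustrated with a minimal sketch. It assumes the segmentation has already been reduced to a per-pixel traversability mask with values in [0, 1] and that each candidate trajectory is a list of pixel coordinates; the function names and the mean-score rule are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (row, col) pixel coordinate; an assumed representation


def score_trajectory(mask: Sequence[Sequence[float]], traj: Sequence[Point]) -> float:
    """Mean mask value along a trajectory (hypothetical scoring rule).

    Points that fall outside the image contribute a score of 0.
    """
    rows, cols = len(mask), len(mask[0])
    total = 0.0
    for r, c in traj:
        if 0 <= r < rows and 0 <= c < cols:
            total += mask[r][c]
    return total / len(traj)


def select_trajectory(mask: Sequence[Sequence[float]],
                      trajs: List[Sequence[Point]]) -> int:
    """Return the index of the highest-scoring candidate trajectory."""
    scores = [score_trajectory(mask, t) for t in trajs]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: only the middle column is marked as traversable.
mask = [[0.0, 1.0, 0.0] for _ in range(3)]
candidates = [
    [(0, 0), (1, 0), (2, 0)],  # runs along the left, non-traversable column
    [(0, 1), (1, 1), (2, 1)],  # runs along the traversable column
]
best = select_trajectory(mask, candidates)
```

In this toy case the second candidate is selected because every one of its points lies on the traversable column.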

RO Jun 20, 2017 Code
The NAO Backpack: An Open-hardware Add-on for Fast Software Development with the NAO Robot

Matías Mattamala, Gonzalo Olave, Clayder González et al.

We present an open-source accessory for the NAO robot that enables testing computationally demanding algorithms on an external platform while preserving the robot's autonomy and mobility. The platform has the form of a backpack, which can be 3D printed and replicated, and holds an ODROID XU4 board to process algorithms externally with ROS compatibility. We also provide a software bridge between the B-Human framework and ROS to access the robot's sensors in close to real time. We tested the platform in several robotics applications such as data logging, visual SLAM, and robot vision with deep learning techniques. The CAD model, hardware specifications, and software are available online for the benefit of the community: https://github.com/uchile-robotics/nao-backpack

CV Jun 12, 2025
Human-Robot Navigation using Event-based Cameras and Reinforcement Learning

Ignacio Bugueno-Cordova, Javier Ruiz-del-Solar, Rodrigo Verschae

This work introduces a robot navigation controller that combines event cameras and other sensors with reinforcement learning to enable real-time human-centered navigation and obstacle avoidance. Unlike conventional image-based controllers, which operate at fixed rates and suffer from motion blur and latency, this approach leverages the asynchronous nature of event cameras to process visual information over flexible time intervals, enabling adaptive inference and control. The framework integrates event-based perception, additional range sensing, and policy optimization via Deep Deterministic Policy Gradient, with an initial imitation learning phase to improve sample efficiency. Promising results are achieved in simulated environments, demonstrating robust navigation, pedestrian following, and obstacle avoidance. A demo video is available at the project website.

LG May 23, 2025
Diffusion Self-Weighted Guidance for Offline Reinforcement Learning

Augusto Tagle, Javier Ruiz-del-Solar, Felipe Tobar

Offline reinforcement learning (RL) recovers the optimal policy $π$ given historical observations of an agent. In practice, $π$ is modeled as a weighted version of the agent's behavior policy $μ$, using a weight function $w$ that acts as a critic of the agent's behavior. Though recent approaches to offline RL based on diffusion models have exhibited promising results, the computation of the required scores is challenging due to their dependence on the unknown $w$. In this work, we alleviate this issue by constructing a diffusion over both the actions and the weights. With the proposed setting, the required scores are directly obtained from the diffusion model without learning extra networks. Our main conceptual contribution is a novel guidance method, where guidance (which is a function of $w$) comes from the same diffusion model; our proposal is therefore termed Self-Weighted Guidance (SWG). We show that SWG generates samples from the desired distribution on toy examples and performs on par with state-of-the-art methods on D4RL's challenging environments, while maintaining a streamlined training pipeline. We further validate SWG through ablation studies on weight formulations and scalability.
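
The weighted-policy formulation above can be written out explicitly. The conditioning on a state $s$ and the arguments of $w$ are notational assumptions, since the abstract does not spell them out, and the identity is stated for clean (non-noised) actions only.

```latex
\pi(a \mid s) \;\propto\; \mu(a \mid s)\, w(s, a)
\quad\Longrightarrow\quad
\nabla_a \log \pi(a \mid s)
  \;=\; \nabla_a \log \mu(a \mid s) \;+\; \nabla_a \log w(s, a)
```

The normalizing constant does not depend on $a$, so it drops out of the gradient. The second term is the part that depends on the unknown $w$.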

LG Mar 9, 2021
Learning to Play Soccer From Scratch: Sample-Efficient Emergent Coordination through Curriculum-Learning and Competition

Pavan Samtani, Francisco Leiva, Javier Ruiz-del-Solar

This work proposes a scheme that allows learning complex multi-agent behaviors in a sample-efficient manner, applied to 2v2 soccer. The problem is formulated as a Markov game and solved using deep reinforcement learning. We propose a basic multi-agent extension of TD3 for learning the policy of each player in a decentralized manner. To ease learning, the task of 2v2 soccer is divided into three stages: 1v0, 1v1, and 2v2. Learning in the multi-agent stages (1v1 and 2v2) uses agents trained in a previous stage as fixed opponents. In addition, to raise performance significantly, we propose using experience sharing, a method that shares experience from a fixed opponent trained in a previous stage with the agent currently learning, and a form of frame-skipping. Our results show that high-quality soccer play can be obtained with our approach in just under 40M interactions. A summarized video of the resulting game play can be found at https://youtu.be/f25l1j1U9RM.

RO Aug 14, 2019
Continuous Control for High-Dimensional State Spaces: An Interactive Learning Approach

Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar et al.

Deep Reinforcement Learning (DRL) has become a powerful methodology to solve complex decision-making problems. However, DRL has several limitations when used in real-world problems (e.g., robotics applications). For instance, long training times are required and, in contrast to simulated environments, cannot be accelerated, and reward functions may be hard to specify, model, or compute. Moreover, the transfer of policies learned in a simulator to the real world has limitations (reality gap). On the other hand, machine learning methods that rely on the transfer of human knowledge to an agent have been shown to be time-efficient for obtaining well-performing policies and do not require a reward function. In this context, we analyze the use of human corrective feedback during task execution to learn policies with high-dimensional state spaces by using the D-COACH framework, and we propose new variants of this framework. D-COACH is a Deep Learning based extension of COACH (COrrective Advice Communicated by Humans), where humans are able to shape policies through corrective advice. The enhanced version of D-COACH proposed in this paper largely reduces the time and effort a human needs to train a policy. Experimental results validate the efficiency of the D-COACH framework in three different problems (simulated and with real robots), and show that its enhanced version reduces the human training effort considerably and makes it feasible to learn policies within periods of time in which a DRL agent does not reach any improvement.

CV Nov 29, 2018
Playing Soccer without Colors in the SPL: A Convolutional Neural Network Approach

Francisco Leiva, Nicolás Cruz, Ignacio Bugueño et al.

The goal of this paper is to propose a vision system for humanoid robotic soccer that does not use any color information. The main features of this system are: (i) real-time operation on the NAO robot, and (ii) the ability to robustly detect the ball, the robots, their orientations, the lines, and key field features. Our ball detector, robot detector, and robot orientation detector obtain the highest reported detection rates. The proposed vision system is tested on an SPL field with several NAO robots under realistic and highly demanding conditions. The obtained results are: a robot detection rate of 94.90%, a ball detection rate of 97.10%, and a completely perceived orientation rate of 99.88% when the observed robot is static and 95.52% when the observed robot is moving.

RO Nov 20, 2018
Visual SLAM-based Localization and Navigation for Service Robots: The Pepper Case

Cristopher Gómez, Matías Mattamala, Tim Resink et al.

We propose a Visual-SLAM based localization and navigation system for service robots. Our system is built on top of the ORB-SLAM monocular system but extended by the inclusion of wheel odometry in the estimation procedures. As a case study, the proposed system is validated using the Pepper robot, whose short-range LIDARs and RGB-D camera do not allow the robot to self-localize in large environments. The localization system is tested in navigation tasks using Pepper in two different environments: a medium-size laboratory, and a large-size hall.

RO Nov 20, 2018
Near Real-Time Object Recognition for Pepper based on Deep Neural Networks Running on a Backpack

Esteban Reyes, Cristopher Gómez, Esteban Norambuena et al.

The main goal of the paper is to provide Pepper with a near real-time object recognition system based on deep neural networks. The proposed system is based on YOLO (You Only Look Once), a deep neural network that is able to detect and recognize objects robustly and at high speed. In addition, considering that YOLO cannot be run on Pepper's internal computer in near real time, we propose to use a Backpack for Pepper, which holds a Jetson TK1 card and a battery. By using this card, Pepper is able to robustly detect and recognize objects in images of 320x320 pixels at about 5 frames per second.

LG Sep 30, 2018
Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks

Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar et al.

Deep Reinforcement Learning (DRL) has become a powerful strategy to solve complex decision-making problems based on Deep Neural Networks (DNNs). However, it is highly data demanding and thus infeasible in physical systems for most applications. In this work, we explore an alternative Interactive Machine Learning (IML) strategy for training DNN policies based on human corrective feedback, with a method called Deep COACH (D-COACH). This approach not only takes advantage of the knowledge and insights of human teachers as well as the power of DNNs, but also has no need of a reward function (which sometimes implies the need for external perception to compute rewards). We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent's actions during execution. The D-COACH framework has the potential to solve complex problems without requiring much data or time. Experimental results validated the efficiency of the framework in three different problems (two simulated, one with a real robot), with state spaces of low and high dimensions, showing the capacity to successfully learn policies for continuous action spaces, such as in the Car Racing and Cart-Pole problems, faster than with DRL.
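
The idea of shaping a policy through corrective feedback can be sketched as follows. This is a simplified illustration, not the D-COACH implementation: it assumes a scalar feedback signal `h` in {-1, 0, +1}, a fixed correction magnitude `e`, and a one-dimensional linear policy in place of a deep network. All names are hypothetical.

```python
class LinearPolicy:
    """Toy 1-D policy a = w * s + b, standing in for a neural network."""

    def __init__(self, lr: float = 0.1) -> None:
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def act(self, s: float) -> float:
        return self.w * s + self.b

    def update(self, s: float, target: float) -> None:
        # One gradient step on the squared error 0.5 * (act(s) - target)^2.
        err = self.act(s) - target
        self.w -= self.lr * err * s
        self.b -= self.lr * err


def corrective_step(policy: LinearPolicy, state: float, h: int, e: float = 0.5) -> None:
    """Apply one corrective-feedback update.

    h = +1 means "increase the action", h = -1 means "decrease it",
    and h = 0 means no feedback was given, so nothing is learned.
    """
    if h == 0:
        return
    target = policy.act(state) + e * h  # shift the executed action by e
    policy.update(state, target)


policy = LinearPolicy()
before = policy.act(1.0)
corrective_step(policy, state=1.0, h=+1)  # the teacher asks for a larger action
after = policy.act(1.0)
```

No reward function appears anywhere in the loop: the only learning signal is the direction of the human correction.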

CV Mar 28, 2018
A Survey on Deep Learning Methods for Robot Vision

Javier Ruiz-del-Solar, Patricio Loncomilla, Naiomi Soto

Deep learning has allowed a paradigm shift in pattern recognition, from using hand-crafted features together with statistical classifiers to using general-purpose learning procedures for learning data-driven representations, features, and classifiers together. The application of this new paradigm has been particularly successful in computer vision, in which the development of deep learning methods for vision applications has become a hot research topic. Given that deep learning has already attracted the attention of the robot vision community, the main purpose of this survey is to address the use of deep learning in robot vision. To achieve this, a comprehensive overview of deep learning and its usage in computer vision is given, which includes a description of the most frequently used neural models and their main application areas. Then, the standard methodology and tools used for designing deep-learning-based vision systems are presented. Afterwards, a review of the principal work using deep learning in robot vision is presented, as well as current and future trends related to the use of deep learning in robotics. This survey is intended to be a guide for the developers of robot vision systems.

CV Jun 20, 2017
Using Convolutional Neural Networks in Robots with Limited Computational Resources: Detecting NAO Robots while Playing Soccer

Nicolás Cruz, Kenzo Lobos-Tsunekawa, Javier Ruiz-del-Solar

The main goal of this paper is to analyze the general problem of using Convolutional Neural Networks (CNNs) in robots with limited computational capabilities, and to propose general design guidelines for their use. In addition, two different CNN-based NAO robot detectors that are able to run in real time while playing soccer are proposed. One of the detectors is based on XNOR-Net and the other on SqueezeNet. Each detector is able to process a robot object proposal in ~1 ms, with an average of 1.5 proposals per frame obtained by the upper camera of the NAO. The obtained detection rate is ~97%.

RO Jun 20, 2017
Toward Real-Time Decentralized Reinforcement Learning using Finite Support Basis Functions

Kenzo Lobos-Tsunekawa, David L. Leottau, Javier Ruiz-del-Solar

This paper addresses the design and implementation of complex Reinforcement Learning (RL) behaviors where multi-dimensional action spaces are involved, as well as the need to execute the behaviors in real time using robotic platforms with limited computational resources and training times. For this purpose, we propose the use of decentralized RL in combination with finite support basis functions as alternatives to Gaussian RBFs, in order to alleviate the effects of the curse of dimensionality on the action and state spaces, respectively, and to reduce the computation time. As a testbed, an RL-based controller for the in-walk kick in NAO robots, a challenging and critical problem for soccer robotics, is used. The reported experiments show empirically that our solution saves up to 99.94% of execution time and 98.82% of memory consumption during execution, without diminishing performance compared to classical approaches.
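
The computational advantage of finite support basis functions over Gaussian RBFs can be seen in a small sketch. Triangular functions are used here as one assumed example of a finite support basis; the abstract does not say which functions the paper actually uses, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def triangular_features(x: float, centers: Sequence[float], width: float) -> List[float]:
    """Triangular basis: exactly zero whenever |x - c| >= width (finite support)."""
    return [max(0.0, 1.0 - abs(x - c) / width) for c in centers]


def gaussian_features(x: float, centers: Sequence[float], sigma: float) -> List[float]:
    """Gaussian RBF: strictly positive for every center (infinite support)."""
    return [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]


centers = [0.0, 1.0, 2.0, 3.0, 4.0]
tri = triangular_features(1.25, centers, width=1.0)
gauss = gaussian_features(1.25, centers, sigma=1.0)

# Number of basis functions that actually contribute at x = 1.25.
active_tri = sum(1 for v in tri if v > 0.0)
active_gauss = sum(1 for v in gauss if v > 0.0)
```

With uniformly spaced centers and a width equal to the spacing, at most two triangular features are nonzero for any input, so only those entries need to be evaluated and stored, whereas every Gaussian feature contributes.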

CV Jun 20, 2017
Recognition of Grasp Points for Clothes Manipulation under unconstrained Conditions

Luz María Martínez, Javier Ruiz-del-Solar

In this work, a system for recognizing grasp points in RGB-D images is proposed. This system is intended to be used by a domestic robot when deploying clothes lying at a random position on a table. Taking into consideration that the grasp points are usually near key parts of clothing, such as the waist of pants or the neck of a shirt, the proposed system attempts to detect these key parts first, using a local multivariate contour that adapts its shape accordingly. Then, the proposed system applies the Vessel Enhancement filter to identify wrinkles in the clothes, which allows a roughness index to be computed for the clothes. Finally, by combining (i) the key part contours and (ii) the roughness information obtained by the vessel filter, the system is able to recognize grasp points for unfolding a piece of clothing. The recognition system is validated using realistic RGB-D images of different cloth types.