Javier Gonzalez-Jimenez

9 papers

772 citations

Novelty 51%

AI Score 31

Ranked #141,217 of 201,326 authors (top 70%) · #44,168 in CV (top 75%)

9 Papers

RO · Mar 5, 2021 · Code

An Analytical Solution to the IMU Initialization Problem for Visual-Inertial Systems

David Zuñiga-Noël, Francisco-Angel Moreno, Javier Gonzalez-Jimenez

The fusion of visual and inertial measurements is becoming increasingly popular in the robotics community, since both sources of information complement each other well. However, in order to perform this fusion, the biases of the Inertial Measurement Unit (IMU) as well as the direction of gravity must be initialized first. Additionally, in the case of a monocular camera, the metric scale is also needed. The most popular visual-inertial initialization approaches rely on accurate vision-only motion estimates to build a non-linear optimization problem that solves for these parameters iteratively. In this paper, we build on the previous work in [1] and propose an analytical solution to estimate the accelerometer bias, the direction of gravity and the scale factor in a maximum-likelihood framework. This formulation results in a very efficient estimation approach and, due to the non-iterative nature of the solution, avoids the intrinsic issues of previous iterative solutions. We present an extensive validation of the proposed IMU initialization approach and a performance comparison against the state-of-the-art approach described in [2] with real data from the publicly available EuRoC dataset, achieving comparable accuracy at a fraction of its computational cost and without requiring an initial guess for the scale factor. We also provide an open-source C++ reference implementation.

CV · Jan 21, 2021 · Code

Fast and Robust Certifiable Estimation of the Relative Pose Between Two Calibrated Cameras

Mercedes Garcia-Salguero, Javier Gonzalez-Jimenez

This work contributes an efficient algorithm to solve the Relative Pose problem (RPp) between calibrated cameras and certify the optimality of the solution, given a set of pair-wise feature correspondences affected by noise and possibly corrupted by wrong matches. We propose a family of certifiers that is shown to increase the ratio of detected optimal solutions. This set of certifiers is incorporated into a fast essential matrix estimation pipeline that, given any initial guess for the RPp, refines it iteratively on the product space of 3D rotations and the 2-sphere. In addition, this fast certifiable pipeline is integrated into a robust framework that combines Graduated Non-convexity and the Black-Rangarajan duality between robust functions and line processes. We show through extensive experiments on synthetic and real data that the proposed framework provides a fast and robust relative pose estimation. We make the code publicly available at https://github.com/mergarsal/FastCertRelPose.git.

RO · Jul 3, 2019 · Code

Intrinsic Calibration of Depth Cameras for Mobile Robots using a Radial Laser Scanner

David Zuñiga-Noël, Jose-Raul Ruiz-Sarmiento, Javier Gonzalez-Jimenez

Depth cameras, typically in RGB-D configurations, are common devices in mobile robotic platforms given their appealing features: high frequency and resolution, low price and power requirements, among others. These sensors may come with significant, non-linear errors in the depth measurements that jeopardize robot tasks, like free-space detection, environment reconstruction or visual robot-human interaction. This paper presents a method to calibrate such systematic errors with the help of a second, more precise range sensor, in our case a radial laser scanner. Contrary to what it may seem at first, this is not a serious limitation in practice, since these two sensors are often mounted jointly in many mobile robotic platforms, as they complement each other well. Moreover, the laser scanner is needed only for the calibration process and can be removed afterwards. The main contributions of the paper are: i) the calibration is formulated from a probabilistic perspective through a Maximum Likelihood Estimation problem, and ii) the proposed method can be easily executed automatically by mobile robotic platforms. To validate the proposed approach, we evaluated both the local distortion of 3D planar reconstructions and the global shifts in the measurements, obtaining considerably more accurate results. A C++ open-source implementation of the presented method has been released for the benefit of the community.

CV · May 26, 2017 · Code

PL-SLAM: a Stereo SLAM System through the Combination of Points and Line Segments

Ruben Gomez-Ojeda, David Zuñiga-Noël, Francisco-Angel Moreno et al.

Traditional approaches to stereo visual SLAM rely on point features to estimate the camera trajectory and build a map of the environment. In low-textured environments, though, it is often difficult to find a sufficient number of reliable point features and, as a consequence, the performance of such algorithms degrades. This paper proposes PL-SLAM, a stereo visual SLAM system that combines both points and line segments to work robustly in a wider variety of scenarios, particularly in those where point features are scarce or not well-distributed in the image. PL-SLAM leverages both points and segments at all stages of the process: visual odometry, keyframe selection, bundle adjustment, etc. We also contribute a loop closure procedure through a novel bag-of-words approach that exploits the combined descriptive power of the two kinds of features. Additionally, the resulting map is richer and more diverse in 3D elements, which can be exploited to infer valuable, high-level scene structures like planes, empty spaces, ground plane, etc. (not addressed in this work). Our proposal has been tested with several popular datasets (such as KITTI and EuRoC), and is compared to state-of-the-art methods like ORB-SLAM, revealing a more robust performance in most of the experiments, while still running in real-time. An open source version of the PL-SLAM C++ code will be released for the benefit of the community.

CV · Mar 30, 2020

Certifiable Relative Pose Estimation

Mercedes Garcia-Salguero, Jesus Briales, Javier Gonzalez-Jimenez

In this paper we present the first fast optimality certifier for the non-minimal version of the Relative Pose problem for calibrated cameras from epipolar constraints. The proposed certifier is based on Lagrangian duality and relies on a novel closed-form expression for dual points. We also leverage an efficient solver that performs local optimization on the manifold of the original problem's non-convex domain. The optimality of the solution is then checked via our novel fast certifier. Our extensive experiments demonstrate that, despite its simplicity, this certifiable solver performs excellently on synthetic data, repeatedly attaining the (certified a posteriori) optimal solution, and shows satisfactory performance on real data.
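For readers unfamiliar with the underlying problem, the snippet below is a minimal sketch of the standard squared epipolar cost that a relative pose solver of this kind would minimize. It is not the paper's solver or certifier. The function names, the convention X1 = R X0 + t, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def skew(t):
    """Return the 3x3 skew-symmetric matrix [t]x such that [t]x v = t x v."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_matrix(R, t):
    """Essential matrix E = [t]x R, assuming points map as X1 = R X0 + t."""
    return skew(t) @ R

def epipolar_cost(E, f0, f1):
    """Sum of squared epipolar residuals f1_i^T E f0_i over all correspondences.

    f0 and f1 are (N, 3) arrays of unit bearing vectors in cameras 0 and 1.
    """
    residuals = np.einsum("ij,jk,ik->i", f1, E, f0)
    return float(np.sum(residuals ** 2))

# Hypothetical synthetic scene: 20 points in front of camera 0.
rng = np.random.default_rng(0)
X0 = rng.uniform(-1.0, 1.0, size=(20, 3)) + np.array([0.0, 0.0, 5.0])

angle = 0.1  # rotation about the z axis, in radians
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle), np.cos(angle), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.0, 0.0])  # unit-norm translation (a point on the 2-sphere)

X1 = X0 @ R.T + t  # the same points expressed in camera 1
f0 = X0 / np.linalg.norm(X0, axis=1, keepdims=True)
f1 = X1 / np.linalg.norm(X1, axis=1, keepdims=True)

cost_true = epipolar_cost(essential_matrix(R, t), f0, f1)
cost_wrong = epipolar_cost(essential_matrix(R, np.array([0.0, 1.0, 0.0])), f0, f1)
print(cost_true, cost_wrong)
```

With noise-free correspondences the cost at the true pose is zero up to floating-point error, while a wrong translation direction gives a clearly positive cost. Because a local solver can stop at a point that is not the global minimum, checking optimality after the fact is what a certifier is for.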

RO · Jun 11, 2019

Automatic Multi-Sensor Extrinsic Calibration for Mobile Robots

David Zuñiga-Noël, Jose-Raul Ruiz-Sarmiento, Ruben Gomez-Ojeda et al.

In order to fuse measurements from multiple sensors mounted on a mobile robot, they must be expressed in a common reference system through their relative spatial transformations. In this paper, we present a method to estimate the full 6DoF extrinsic calibration parameters of multiple heterogeneous sensors (Lidars, Depth and RGB cameras) suitable for automatic execution on a mobile robot. Our method computes the 2D calibration parameters (x, y, yaw) through a motion-based approach, while for the remaining 3 parameters (z, pitch, roll) it requires the observation of the ground plane for a short period of time. What sets this proposal apart from others is that: i) all calibration parameters are initialized in closed form, and ii) the scale ambiguity inherent to motion estimation from a monocular camera is explicitly handled, enabling the combination of these sensors and metric ones (Lidars, stereo rigs, etc.) within the same optimization framework. Additionally, outlier observations arising from local sensor drift are automatically detected and removed from the calibration process. We provide a formal definition of the problem, as well as of the contributed method, for which a C++ implementation has been made publicly available. The suitability of the method has been assessed in simulation and with real data from indoor and outdoor scenarios. Finally, improvements over state-of-the-art motion-based calibration proposals are shown through experimental evaluation.

CV · Sep 25, 2018

Geometric-based Line Segment Tracking for HDR Stereo Sequences

Ruben Gomez-Ojeda, Javier Gonzalez-Jimenez

In this work, we propose a purely geometrical approach for the robust matching of line segments for challenging stereo streams with severe illumination changes or High Dynamic Range (HDR) environments. To that purpose, we exploit the univocal nature of the matching problem, i.e. every observation must be matched to a single feature or not matched at all. We state the problem as a sparse, convex, L1-minimization of the matching vector regularized by the geometric constraints. This formulation allows for the robust tracking of line segments along sequences where traditional appearance-based matching techniques tend to fail due to dynamic changes in illumination conditions. Moreover, the proposed matching algorithm also results in a considerable speed-up over previous state-of-the-art techniques, making it suitable for real-time applications such as Visual Odometry (VO). This, of course, comes at the expense of a slightly lower number of matches in comparison with appearance-based methods, and also limits its application to continuous video sequences, as it is constrained to small pose increments between consecutive frames. We validate the claimed advantages by first evaluating the matching performance in challenging video sequences, and then testing the method in a benchmarked point- and line-based VO algorithm.

CV · Jul 5, 2017

Learning-based Image Enhancement for Visual Odometry in Challenging HDR Environments

Ruben Gomez-Ojeda, Zichao Zhang, Javier Gonzalez-Jimenez et al.

One of the main open challenges in visual odometry (VO) is robustness to difficult illumination conditions or high dynamic range (HDR) environments. The main difficulties in these situations come from both the limitations of the sensors and the inability to successfully track interest points because of the strong assumptions in VO, such as brightness constancy. We address this problem from a deep learning perspective, for which we first fine-tune a Deep Neural Network (DNN) with the purpose of obtaining enhanced representations of the sequences for VO. Then, we demonstrate how the insertion of Long Short-Term Memory (LSTM) allows us to obtain temporally consistent sequences, as the estimation depends on previous states. However, the use of very deep networks does not allow the insertion into a real-time VO framework; therefore, we also propose a Convolutional Neural Network (CNN) of reduced size capable of performing faster. Finally, we validate the enhanced representations by evaluating the sequences produced by the two architectures in several state-of-the-art VO algorithms, such as ORB-SLAM and DSO.

CV · May 27, 2015

Training a Convolutional Neural Network for Appearance-Invariant Place Recognition

Ruben Gomez-Ojeda, Manuel Lopez-Antequera, Nicolai Petkov et al.

Place recognition is one of the most challenging problems in computer vision, and has become a key part of mobile robotics and autonomous driving applications for performing loop closure in visual SLAM systems. Moreover, the difficulty of recognizing a revisited location increases with appearance changes caused, for instance, by weather or illumination variations, which hinders the long-term application of such algorithms in real environments. In this paper we present a convolutional neural network (CNN), trained for the first time with the purpose of recognizing revisited locations under severe appearance changes, which maps images to a low dimensional space where Euclidean distances represent place dissimilarity. In order for the network to learn the desired invariances, we train it with triplets of images selected from datasets which present a challenging variability in visual appearance. The triplets are selected in such a way that two samples are from the same location and the third one is taken from a different place. We validate our system through extensive experimentation, where we demonstrate better performance than state-of-the-art algorithms on a number of popular datasets.
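The triplet scheme described above can be illustrated with a small sketch. The abstract does not state the exact loss, so the margin-based hinge form, the `margin` value, and the toy 2D embeddings below are assumptions. In practice the embeddings would come from the CNN, not be written by hand.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss on Euclidean distances between embeddings.

    The loss is zero once the same-place pair (anchor, positive) is closer
    than the different-place pair (anchor, negative) by at least `margin`.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(0.0, d_pos - d_neg + margin)

# Hypothetical embeddings: two views of the same place and one of another place.
anchor = np.array([0.0, 0.0])
positive = np.array([0.1, 0.0])   # same location, different appearance
negative = np.array([3.0, 4.0])   # different location

loss_good = triplet_loss(anchor, positive, negative)
loss_bad = triplet_loss(anchor, negative, positive)  # roles swapped on purpose
print(loss_good, loss_bad)
```

A well-separated triplet contributes no loss, while a triplet in which the different-place image lies closer than the same-place image is penalized. That penalty is what pushes Euclidean distance in the embedding space toward representing place dissimilarity.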