RO Aug 1, 2024
High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets
Jian Li, Bowen Xu, Sören Schwertfeger
Robotic datasets are important for scientific benchmarking and developing algorithms, for example for Simultaneous Localization and Mapping (SLAM). Modern robotic datasets feature video data of high resolution and high framerates. Storing and sharing those datasets thus becomes very costly, especially if more than one camera is used for the datasets. It is thus essential to store this video data in a compressed format. This paper investigates the use of modern video encoders for robotic datasets. We provide software that can replay mp4 videos within ROS 1 and ROS 2 frameworks, supporting synchronized playback in simulated time. Furthermore, the paper evaluates different encoders and their settings to find optimal configurations in terms of resulting size, quality and encoding time. Through this work we show that it is possible to store and share even the highest-quality video datasets within reasonable storage constraints.
CV Aug 5, 2021
Video Contrastive Learning with Global Context
Haofei Kuang, Yi Zhu, Zhi Zhang et al.
Contrastive learning has revolutionized the field of self-supervised image representation learning and has recently been adapted to the video domain. One of the greatest advantages of contrastive learning is that it allows us to flexibly define powerful loss objectives as long as we can find a reasonable way to formulate positive and negative samples to contrast. However, existing approaches rely heavily on short-range spatiotemporal salience to form clip-level contrastive signals and thus cannot make use of global context. In this paper, we propose a new video-level contrastive learning method based on segments to formulate positive pairs. Our formulation is able to capture global context in a video and is thus robust to temporal content change. We also incorporate a temporal order regularization term to enforce the inherent sequential structure of videos. Extensive experiments show that our video-level contrastive learning framework (VCLR) is able to outperform previous state-of-the-art methods on five video datasets for downstream action classification, action localization and video retrieval. Code is available at https://github.com/amazon-research/video-contrastive-learning.
CV Nov 26, 2025
From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings
Jiajie Zhang, Sören Schwertfeger, Alexander Kleiner
We present a novel unsupervised framework to unlock vast unlabeled human demonstration data from continuous industrial video streams for Vision-Language-Action (VLA) model pre-training. Our method first trains a lightweight motion tokenizer to encode motion dynamics, then employs an unsupervised action segmenter leveraging a novel "Latent Action Energy" metric to discover and segment semantically coherent action primitives. The pipeline outputs both segmented video clips and their corresponding latent action sequences, providing structured data directly suitable for VLA pre-training. Evaluations on public benchmarks and a proprietary electric motor assembly dataset demonstrate effective segmentation of key tasks performed by humans at workstations. Further clustering and quantitative assessment via a Vision-Language Model confirm the semantic coherence of the discovered action primitives. To our knowledge, this is the first fully automated end-to-end system for extracting and organizing VLA pre-training data from unstructured industrial videos, offering a scalable solution for embodied AI integration in manufacturing.
RO Feb 21, 2024
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand
Yumeng Liu, Yaxun Yang, Youzhuo Wang et al.
In this paper, we introduce RealDex, a pioneering dataset capturing authentic dexterous hand grasping motions infused with human behavioral patterns, enriched by multi-view and multimodal visual data. Utilizing a teleoperation system, we seamlessly synchronize human-robot hand poses in real time. This collection of human-like motions is crucial for training dexterous hands to mimic human movements more naturally and precisely. RealDex holds immense promise in advancing humanoid robots for automated perception, cognition, and manipulation in real-world scenarios. Moreover, we introduce a cutting-edge dexterous grasping motion generation framework, which aligns with human experience and enhances real-world applicability by effectively utilizing Multimodal Large Language Models. Extensive experiments have demonstrated the superior performance of our method on RealDex and other open datasets. The complete dataset and code will be made available upon the publication of this work.
RO Mar 11, 2024
3DRef: 3D Dataset and Benchmark for Reflection Detection in RGB and Lidar Data
Xiting Zhao, Sören Schwertfeger
Reflective surfaces present a persistent challenge for reliable 3D mapping and perception in robotics and autonomous systems. However, existing reflection datasets and benchmarks remain limited to sparse 2D data. This paper introduces the first large-scale 3D reflection detection dataset containing more than 50,000 aligned samples of multi-return Lidar, RGB images, and 2D/3D semantic labels across diverse indoor environments with various reflections. Textured 3D ground truth meshes enable automatic point cloud labeling to provide precise ground truth annotations. Detailed benchmarks evaluate three Lidar point cloud segmentation methods, as well as current state-of-the-art image segmentation networks for glass and mirror detection. The proposed dataset advances reflection detection by providing a comprehensive testbed with precise global alignment, multi-modal data, and diverse reflective objects and materials. It will drive future research towards reliable reflection detection. The dataset is publicly available at http://3dref.github.io
RO Feb 2, 2022
Accurate calibration of multi-perspective cameras from a generalization of the hand-eye constraint
Yifu Wang, Wenqing Jiang, Kun Huang et al.
Multi-perspective cameras are quickly gaining importance in many applications such as smart vehicles and virtual or augmented reality. However, a large system size or absence of overlap in neighbouring fields-of-view often complicates their calibration. We present a novel solution which relies on the availability of an external motion capture system. Our core contribution consists of an extension to the hand-eye calibration problem which jointly solves multi-eye-to-base problems in closed form. We furthermore demonstrate its equivalence to the multi-eye-in-hand problem. The practical validity of our approach is supported by our experiments, indicating that the method is highly efficient and accurate, and outperforms existing closed-form alternatives.
RO Nov 16, 2021
Hierarchical Topometric Representation of 3D Robotic Maps
Zhenpeng He, Hao Sun, Jiawei Hou et al.
In this paper, we propose a method for generating a hierarchical, volumetric topological map from 3D point clouds. There are three basic hierarchical levels in our map: storey - region - volume. The advantages of our method are reflected in both input and output. In terms of input, we accept multi-storey point clouds and building structures with sloping roofs or ceilings. In terms of output, we can generate results with metric information of different dimensionality that are suitable for different robotics applications. The algorithm generates the volumetric representation by generating volumes from a 3D voxel occupancy map. We then add passages (connections between volumes), combine small volumes into a big region and use a 2D segmentation method for better topological representation. We evaluate our method on several freely available datasets. The experiments highlight the advantages of our approach.
RO Nov 17, 2020
Improved Visual-Inertial Localization for Low-cost Rescue Robots
Xiaoling Long, Qingwen Xu, Yijun Yuan et al.
This paper improves visual-inertial systems to boost the localization accuracy for low-cost rescue robots. When robots traverse rugged terrain, the performance of pose estimation suffers from large noise on the measurements of the inertial sensors due to ground contact forces, especially for low-cost sensors. Therefore, we propose Threshold-based and Dynamic Time Warping-based methods to detect abnormal measurements and mitigate such faults. The two methods are embedded into the popular VINS-Mono system to evaluate their performance. Experiments are performed on simulation and real robot data, which show that both methods increase the pose estimation accuracy. Moreover, the Threshold-based method performs better when the noise is small and the Dynamic Time Warping-based one shows greater potential on large noise.
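As a rough illustration of the two detection ideas named in this abstract, the sketch below flags inertial samples that deviate from the window median by more than a fixed limit, and computes a textbook dynamic time warping (DTW) distance between two 1D signals. It is a minimal sketch under stated assumptions, not the authors' VINS-Mono integration: the names `flag_by_threshold` and `dtw_distance`, the median reference, the limit value, and the sample data are all hypothetical.

```python
from statistics import median


def flag_by_threshold(samples, limit):
    """Mark samples whose distance to the window median exceeds `limit`.

    Hypothetical rule for illustration; the paper's threshold test may differ.
    """
    center = median(samples)
    return [abs(s - center) > limit for s in samples]


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance with |x - y| as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


accel_z = [0.1, 0.0, 5.0, -0.1]               # made-up window with one spike
print(flag_by_threshold(accel_z, 1.0))        # [False, False, True, False]
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

A DTW-based detector could, for instance, compare a measured window against a reference signal and treat a large distance as abnormal; how the reference is chosen is left open here.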
RO Jul 23, 2020
Advanced Mapping Robot and High-Resolution Dataset
Hongyu Chen, Zhijie Yang, Xiting Zhao et al.
This paper presents a fully hardware synchronized mapping robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. Nine high-resolution cameras and two 32-beam 3D Lidars were used along with a professional, static 3D scanner for ground truth map collection. With all the sensors calibrated on the mapping robot, three datasets are collected to evaluate the performance of mapping algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. We provide the datasets for download and the mapping and evaluation procedures are made in a very easily reproducible manner for maximum comparability. We have also conducted a survey on available robotics-related datasets and compiled a big table with those datasets and a number of properties of them.
RO Mar 11, 2020
Self-supervised Point Set Local Descriptors for Point Cloud Registration
Yijun Yuan, Jiawei Hou, Andreas Nüchter et al.
In this work, we propose to learn local descriptors for point clouds in a self-supervised manner. In each iteration of the training, the input of the network is merely one unlabeled point cloud. Building on our previous work, which directly solves the transformation between two point sets in one step without correspondences, the proposed method is able to train from one point cloud by supervising its self-rotation, which we randomly generate. The whole training requires no manual annotation. In several experiments we evaluate the performance of our method on various datasets and compare it to other state-of-the-art algorithms. The results show that our self-supervised learned descriptor achieves equivalent or even better performance than the supervised learned model, while being easier to train and not requiring labeled data.
RO Mar 1, 2020
Non-iterative One-step Solution for Point Set Registration Problem on Pose Estimation without Correspondence
Yijun Yuan, Dorit Borrmann, Andreas Nüchter et al.
In this work, we propose to directly find the one-step solution for the point set registration problem without correspondences. Inspired by the Kernel Correlation method, we consider the fully connected objective function between two point sets, thus avoiding the computation of correspondences. By utilizing least squares minimization, the transformed objective function is directly solved with existing well-known closed-form solutions, e.g., singular value decomposition, which is usually used for given correspondences. However, using equal weights of costs for each connection will degenerate the solution due to the large influence of distant pairs. Thus, we additionally set a scale on each term to avoid high costs on non-important pairs. As in feature-based registration methods, the similarity between descriptors of points determines the scaling weight. Given the weights, we get a one-step solution. As the runtime is in $\mathcal{O}(n^2)$, we also propose a variant with keypoints that strongly reduces the cost. The experiments show that the proposed method gives a one-step solution without an initial guess. Our method exhibits competitive outlier robustness and accuracy, compared to various other methods, and it is more stable in case of large rotations. Additionally, our one-step solution achieves a performance on-par with the state-of-the-art feature based method TEASER.
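The fully connected, weighted least squares objective described in this abstract can be sketched with the standard SVD-based (Kabsch-style) closed form, applied to all point pairs at once. This is a minimal sketch under stated assumptions and not the authors' implementation: the function names, the Gaussian kernel on descriptor distance, the `sigma` value, and the toy index-based descriptors are all hypothetical choices made for illustration.

```python
import numpy as np


def descriptor_weights(desc_src, desc_dst, sigma):
    """Pairwise weights w_ij from descriptor similarity (assumed Gaussian kernel)."""
    diff = desc_src[:, None, :] - desc_dst[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2))


def weighted_one_step_registration(src, dst, weights):
    """Closed-form R, t minimizing sum_ij w_ij * ||R @ src[i] + t - dst[j]||^2."""
    w_total = weights.sum()
    # Weighted centroids over all (i, j) pairs.
    src_mean = weights.sum(axis=1) @ src / w_total
    dst_mean = weights.sum(axis=0) @ dst / w_total
    src_c = src - src_mean
    dst_c = dst - dst_mean
    # 3x3 weighted cross-covariance: sum_ij w_ij * src_c[i] * dst_c[j]^T.
    H = src_c.T @ weights @ dst_c
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_mean - R @ src_mean
    return R, t


# Toy usage: a rotated and shifted copy of a random cloud.
rng = np.random.default_rng(0)
src = rng.normal(size=(30, 3))
angle = 2.5  # a large rotation about the z axis, in radians
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
dst = src @ R_true.T + t_true
# Hypothetical, perfectly discriminative descriptors (one scalar per point).
desc = np.arange(30, dtype=float)[:, None]
W = descriptor_weights(desc, desc, sigma=0.1)
R_est, t_est = weighted_one_step_registration(src, dst, W)
```

The weight matrix has one entry per point pair, which is where the quadratic runtime mentioned in the abstract comes from; restricting both sets to keypoints shrinks that matrix.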
RO Dec 3, 2019
Comparison and Evaluation of 2D and 3D Range Sensors
Qingwen Xu, Sören Schwertfeger
For mobile robots, range sensors are important to perceive the environment. Sensors that can measure in a 3D volume are especially significant for outdoor robotics, because this environment is often highly unstructured. The quality of the data gathered by those sensors influences all algorithms relying on it. In this paper, the precision of several 2D and 2.5D sensors is therefore measured at different ranges and different incidence angles. The results of all tests are presented and analyzed.
RO Dec 3, 2019
Room Detection for Topological Maps
Sören Schwertfeger, Tianyan Yu
Mapping is an important part of many robotic applications. In order to measure the performance of the mapping process we have to measure the quality of its result: the map. The map is essential for robotic algorithms like localization and path planning. Previously it was shown how matched Topology Graphs can be used for map evaluation by comparing the topology of the robot generated map to the topology of a ground truth map. In this paper we are extending the previous work by detecting open areas, for example rooms, in the 2D grid map and adding those to the topological representation. This way we can avoid the unreliable generation of paths in open areas, thus making the Topology Graph generation, and through that also the Topology Graph matching, more stable and robust. The detection applies the alpha shape algorithm for room detection.
RO Dec 3, 2019
Evaluation of Smartphone IMUs for Small Mobile Search and Rescue Robots
Xiangyang Zhi, Qingwen Xu, Sören Schwertfeger
Small mobile robots are an important class of Search and Rescue Robots. Integrating all required components into such small robots is a difficult engineering task. Smartphones have already been made small, lightweight and cheap by the industry and are thus an excellent candidate as main controller for such robots. In this paper we outline how ROS can be used on Android devices and then evaluate one sensor which is very important for mobile robots: the Inertial Measurement Unit (IMU). Experiments are performed under static and dynamic conditions to measure the error of the IMUs of three smartphones and three professional IMUs. In the experiments we make use of a tracking system and an autonomous mobile robot.
RO Nov 18, 2019
Fast 2D Map Matching Based on Area Graphs
Jiawei Hou, Haofei Kuang, Sören Schwertfeger
We present a novel area matching algorithm for merging two different 2D grid maps. There are many approaches to address this problem; nevertheless, most previous work is built on assumptions such as a rigid transformation or similar scale and modalities of the two maps. In this work we propose a 2D map matching algorithm based on area segmentation. We transfer general 2D occupancy grid maps to an area graph representation, then compute the correct results by voting in that space. In the experiments, we compare with a state-of-the-art method applied to the matching of sensor maps with ground truth layout maps. The experiment shows that our algorithm has a better performance on large-scale maps and a faster computation speed.
AI Nov 15, 2019
Fine-grained Qualitative Spatial Reasoning about Point Positions
Sören Schwertfeger
The ability to persist in the spatial environment is, not only in the robotic context, an essential feature. Positional knowledge is one of the most important aspects of space and a number of methods to represent this information have been developed in the research area of spatial cognition. The basic qualitative spatial representation and reasoning techniques are presented in this thesis and several calculi are briefly reviewed. Features and applications of qualitative calculi are summarized. A new calculus for representing and reasoning about qualitative spatial orientation and distances is being designed. It supports an arbitrary level of granularity over ternary relations of points. Ways of improving the complexity of the composition are shown and an implementation of the calculus demonstrates its capabilities. Existing qualitative spatial calculi of positional information are compared to the new approach and possibilities for future research are outlined.
RO Nov 2, 2019
Furniture Free Mapping using 3D Lidars
Zhenpeng He, Jiawei Hou, Sören Schwertfeger
Mobile robots depend on maps for localization, planning, and other applications. In indoor scenarios, there is often lots of clutter present, such as chairs, tables, other furniture, or plants. While mapping this clutter is important for certain applications, for example navigation, maps that represent just the immobile parts of the environment, i.e. walls, are needed for other applications, like room segmentation or long-term localization. In literature, approaches can be found that use a complete point cloud to remove the furniture in the room and generate a furniture free map. In contrast, we propose a Simultaneous Localization And Mapping (SLAM)-based mobile laser scanning solution. The robot uses an orthogonal pair of Lidars. The horizontal scanner aims to estimate the robot position, whereas the vertical scanner generates the furniture free map. There are three steps in our method: point cloud rearrangement, wall plane detection and semantic labeling. In the experiment, we evaluate the efficiency of removing furniture in a typical indoor environment. We get 99.60% precision in keeping the wall in the 3D result, which shows that our algorithm can remove most of the furniture in the environment. Furthermore, we introduce the application of 2D furniture free mapping for room segmentation.
RO Oct 2, 2019
Pose Estimation for Omni-directional Cameras using Sinusoid Fitting
Haofei Kuang, Qingwen Xu, Xiaoling Long et al.
We propose a novel pose estimation method for geometric vision of omni-directional cameras. On the basis of the regularity of the pixel movement after camera pose changes, we formulate and prove the sinusoidal relationship between pixel movement and camera motion. We use the improved Fourier-Mellin invariant (iFMI) algorithm to find the motion of pixels, which was shown to be more accurate and robust than feature-based methods. While iFMI works only on pin-hole model images and estimates 4 parameters (x, y, yaw, scaling), our method works on panoramic images and estimates the full 6 DoF 3D transform, up to an unknown scale factor. For that we fit the motion of the pixels in the panoramic images, as determined by iFMI, to two sinusoidal functions. The offsets, amplitudes and phase-shifts of the two functions then represent the 3D rotation and translation of the camera between the two images. We perform experiments for 3D rotation, which show that our algorithm outperforms the feature-based methods in accuracy and robustness. We leave the more complex 3D translation experiments for future work.
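To make the fitting step concrete, the sketch below recovers the offset, amplitude and phase shift of a single sinusoid from samples taken at evenly spaced angles over one full turn, as one might have for the columns of a 360° panorama. It is a minimal sketch and not the authors' code: the function name `fit_sinusoid`, the even angular sampling, and the synthetic data are assumptions. Under that sampling, projecting onto the first Fourier harmonic coincides with the least squares fit of `offset + amplitude * sin(theta + phase)`.

```python
import math


def fit_sinusoid(values):
    """Fit offset + amplitude * sin(theta + phase) to samples taken at
    theta_k = 2*pi*k/N for k = 0..N-1 (one full period, N >= 3)."""
    n = len(values)
    thetas = [2.0 * math.pi * k / n for k in range(n)]
    offset = sum(values) / n
    # Coefficients of sin(theta) and cos(theta) in the fitted curve.
    a = 2.0 / n * sum(v * math.sin(t) for v, t in zip(values, thetas))
    b = 2.0 / n * sum(v * math.cos(t) for v, t in zip(values, thetas))
    # amplitude * sin(theta + phase) = a * sin(theta) + b * cos(theta)
    return offset, math.hypot(a, b), math.atan2(b, a)


# Synthetic pixel shifts for 36 azimuth bins (made-up numbers).
n_bins = 36
shifts = [0.5 + 2.0 * math.sin(2.0 * math.pi * k / n_bins + 0.7) for k in range(n_bins)]
offset, amplitude, phase = fit_sinusoid(shifts)
```

How the fitted parameters of the two sinusoids map to rotation and translation is specific to the paper and is not reproduced here.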
RO Oct 1, 2019
Area Graph: Generation of Topological Maps using the Voronoi Diagram
Jiawei Hou, Yijun Yuan, Sören Schwertfeger
Representing a scanned map of the real environment as a topological structure is an important research topic in robotics. Since topological representations of maps save a huge amount of map storage space and online computing time, they are widely used in fields such as path planning, map matching, and semantic mapping. We use a topological map representation, the Area Graph, in which the vertices represent areas and edges represent passages. The Area Graph is developed from a pruned Voronoi Graph, the Topology Graph. We also employ a simple room detection algorithm to compensate for the fact that the Voronoi Graph gets unstable in open areas. We claim that our area segmentation method is superior to state-of-the-art approaches in complex indoor environments and support this claim with a number of experiments.
RO Sep 27, 2019
Mapping with Reflection -- Detection and Utilization of Reflection in 3D Lidar Scans
Xiting Zhao, Zhijie Yang, Sören Schwertfeger
This paper presents a method to detect reflection of 3D light detection and ranging (Lidar) scans and uses it to classify the points and also map objects outside the line of sight. Our software uses several approaches to analyze the point cloud, including intensity peak detection, dual return detection, plane fitting, and finding the boundaries. These approaches can classify the point cloud and detect the reflection in it. By mirroring the reflection points on the detected window pane and adding classification labels on the points, we can improve the map quality in a Simultaneous Localization and Mapping (SLAM) framework. Experiments using real scan data and ground truth data showcase the effectiveness of our method.
RO Sep 23, 2019
Improving CNN-based Planar Object Detection with Geometric Prior Knowledge
Jianxiong Cai, Jiawei Hou, Yiren Lu et al.
In this paper, we focus on the question: how might mobile robots take advantage of affordable RGB-D sensors for object detection? Although current CNN-based object detectors have achieved impressive results, there are three main drawbacks for practical usage on mobile robots: 1) It is hard and time-consuming to collect and annotate large-scale training sets. 2) It usually needs a long training time. 3) CNN-based object detection shows significant weakness in predicting location. We propose an improved method for the detection of planar objects, which rectifies images with geometric information to compensate for the perspective distortion before feeding them to the CNN detector module, typically a CNN-based detector like YOLO or Mask R-CNN. By dealing with the perspective distortion in advance, we eliminate the need for the CNN detector to learn that. Experiments show that this approach significantly boosts the detection performance. Besides, it effectively reduces the number of training images required. In addition to the novel detection framework proposed, we also release an RGB-D dataset and source code for hazmat sign detection. To the best of our knowledge, this is the first work of image rectification for CNN-based object detection, and the dataset is the first publicly available hazmat sign detection dataset with RGB-D sensors.
RO Sep 23, 2019
Path Planning Tolerant to Degraded Locomotion Conditions
Xiaoling Long, Sören Schwertfeger
Mobile robots, especially those driving outdoors and in unstructured terrain, sometimes suffer from failures and errors in locomotion, like unevenly pressurized or flat tires, loose axes or de-tracked tracks. Those are errors that go unnoticed by the odometry of the robot. Other factors that influence the locomotion performance of the robot, like the weight and distribution of the payload, the terrain over which the robot is driving or the battery charge, cannot be compensated for by the PID speed or position controller of the robot because of the physical limits of the system. Traditional planning systems are oblivious to those problems and may thus plan unfeasible trajectories. Also, path following modules that are oblivious to those problems will generate sub-optimal motion patterns, if they can get to the goal at all. In this paper, we present an adaptive path planning algorithm that is tolerant to such degraded locomotion conditions. We do this by constantly observing the executed motions of the robot via simultaneous localization and mapping (SLAM). From the executed path and the given motion commands, we constantly collect and cluster motion primitives (MP) on the fly, which are in turn used for planning. Therefore the robot can automatically detect and adapt to different locomotion conditions and reflect those in the planned paths.
RO Sep 17, 2019
Configuration-Space Flipper Planning on 3D Terrain
Yijun Yuan, Qingwen Xu, Sören Schwertfeger
Flippers are essential components of tracked robot locomotion systems for unstructured terrain, especially within a rescue scenario. Achieving full and semi-autonomy for such rescue robots is the goal of many research efforts. In this work, we propose an algorithm to plan the morphologies of a small rescue robot with four flippers over 3D ground without any extra sensor, such as pressure sensor. To achieve the goal, we simplify the rescue robot as a skeleton on inflated terrain. Its morphology can be represented by configurations of several parameters. Then we plan the mobile movement on 3D terrain with four individually manipulated flippers. We perform real robot experiments on three different obstacles. The results show that we move the flippers very effectively and are thus able to tackle those terrains very well.
RO Jun 12, 2019
Adaptive Navigation Scheme for Optimal Deep-Sea Localization Using Multimodal Perception Cues
Arturo Gomez Chavez, Qingwen Xu, Christian A. Mueller et al.
Underwater robot interventions require a high level of safety and reliability. A major challenge to address is a robust and accurate acquisition of localization estimates, as it is a prerequisite to enable more complex tasks, e.g. floating manipulation and mapping. State-of-the-art navigation in commercial operations, such as oil & gas production (OGP), relies on costly instrumentation. This instrumentation can be partially replaced or assisted by visual navigation methods, especially in deep-sea scenarios where equipment deployment has high costs and risks. Our work presents a multimodal approach that adapts state-of-the-art methods from on-land robotics, i.e., dense point cloud generation in combination with plane representation and registration, to boost underwater localization performance. A two-stage navigation scheme is proposed that initially generates a coarse probabilistic map of the workspace, which is used to filter noise from computed point clouds and planes in the second stage. Furthermore, an adaptive decision-making approach is introduced that determines which perception cues to incorporate into the localization filter to optimize accuracy and computation performance. Our approach is investigated first in simulation and then validated with data from field trials in OGP monitoring and maintenance scenarios.
RO May 27, 2019
Heterogeneous Multi-sensor Calibration based on Graph Optimization
Hongyu Chen, Sören Schwertfeger
Many robotics and mapping systems contain multiple sensors to perceive the environment. Extrinsic parameter calibration, the identification of the position and rotation transform between the frames of the different sensors, is critical to fuse data from different sensors. When obtaining multiple camera to camera, lidar to camera and lidar to lidar calibration results, inconsistencies are likely. We propose a graph-based method to refine the relative poses of the different sensors. We demonstrate our approach using our mapping robot platform, which features twelve sensors that are to be calibrated. The experimental results confirm that the proposed algorithm yields great performance.
RO May 23, 2019
Towards Generation and Evaluation of Comprehensive Mapping Robot Datasets
Hongyu Chen, Xiting Zhao, Jianwen Luo et al.
This paper presents a fully hardware synchronized mapping robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. We also employ a professional, static 3D scanner for ground truth map collection. Three datasets are generated to evaluate the performance of mapping algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. The mapping and evaluation procedures are made in a very easily reproducible manner for maximum comparability. In the end we can draw a couple of conclusions about the tested SLAM algorithms.
CV May 23, 2019
Depth Estimation on Underwater Omni-directional Images Using a Deep Neural Network
Haofei Kuang, Qingwen Xu, Sören Schwertfeger
In this work, we exploit a depth estimation Fully Convolutional Residual Neural Network (FCRN) for in-air perspective images to estimate the depth of underwater perspective and omni-directional images. We train one conventional and one spherical FCRN for underwater perspective and omni-directional images, respectively. The spherical FCRN is derived from the perspective FCRN via a spherical longitude-latitude mapping. For that, the omni-directional camera is modeled as a sphere, while images captured by it are displayed in the longitude-latitude form. Due to the lack of underwater datasets, we synthesize images in both data-driven and theoretical ways, which are used in training and testing. Finally, experiments are conducted on these synthetic images and results are presented both qualitatively and quantitatively. The comparison between ground truth and the estimated depth map indicates the effectiveness of our method.
RO May 8, 2019
Configuration-Space Flipper Planning for Rescue Robots
Yijun Yuan, Letong Wang, Sören Schwertfeger
For rescue robots, flippers endow the robot with the additional ability to pass through various terrain. Autonomous motion is becoming more important. In recent work, autonomy is achieved either by planning with several special states or based on collected data. We consider whether it is possible to build continuous states without collecting old trail data. In this paper, we first model the possible states as a global planning path with parameter configuration of the scene. Then, we follow the path to achieve the autonomous run. We plot the morphology of each path point to show the correctness of the path and implement a simple path following on a real robot to demonstrate the performance of our algorithm.
RO Jan 15, 2019
Learning Autonomous Exploration and Mapping with Semantic Vision
Xiangyang Zhi, Xuming He, Sören Schwertfeger
We address the problem of autonomous exploration and mapping for a mobile robot using visual inputs. Exploration and mapping is a well-known and key problem in robotics, the goal of which is to enable a robot to explore a new environment autonomously and create a map for future usage. Different from classical methods, we propose in this work a learning-based approach based on semantic interpretation of visual scenes. Our method is based on a deep network consisting of three modules: semantic segmentation network, mapping using camera geometry and exploration action network. All modules are differentiable, so the whole pipeline is trained end-to-end based on an actor-critic framework. Our network makes action decisions step by step and generates the free space map simultaneously. To the best of our knowledge, this is the first algorithm that formulates exploration and mapping in a learning framework. We validate our approach in simulated real world environments and demonstrate performance gains over competitive baseline approaches.
RO Nov 26, 2018
Fast Gaussian Process Occupancy Maps
Yijun Yuan, Haofei Kuang, Sören Schwertfeger
In this paper, we demonstrate our work on Gaussian Process Occupancy Mapping (GPOM). We concentrate on the inefficiency of the frame computation of the classical GPOM approaches. In robotics, most of the algorithms are required to run in real time. However, the high cost of computation makes the classical GPOM less useful. In this paper we do not try to optimize the Gaussian Process itself; instead, we focus on the application. By analyzing the time cost of each step of the algorithm, we find a way to reduce the cost while maintaining a good performance compared to the general GPOM framework. From our experiments, we find that our model enables GPOM to run online and achieve a relatively better quality than the classical GPOM.
RO Nov 13, 2018
Improved Fourier Mellin Invariant for Robust Rotation Estimation with Omni-cameras
Qingwen Xu, Arturo Gomez Chavez, Heiko Bülow et al.
Spectral methods such as the improved Fourier Mellin Invariant (iFMI) transform have proved faster, more robust and accurate than feature based methods on image registration. However, iFMI is restricted to work only when the camera moves in 2D space and has not been applied to omni-camera images so far. In this work, we extend the iFMI method and apply a motion model to estimate an omni-camera's pose when it moves in 3D space. This is particularly useful in field robotics applications to get a rapid and comprehensive view of unstructured environments, and to robustly estimate the robot pose. In the experiment section, we compare the extended iFMI method against ORB and AKAZE feature based approaches on three datasets showing different types of environments: office, lawn and urban scenery (MPI-omni dataset). The results show that our method boosts the accuracy of the robot pose estimation two to four times with respect to the feature registration techniques, while offering lower processing times. Furthermore, the iFMI approach presents the best performance against motion blur typically present in mobile robotics.
RO Nov 13, 2018
Topological Area Graph Generation and its Application to Path Planning
Jiawei Hou, Yijun Yuan, Sören Schwertfeger
Representing a scanned map of the real environment as a topological structure is an important research topic in robotics. Since topological representations of maps save a huge amount of map storage space and online computing time, they are widely used in fields such as path planning, map matching, and semantic mapping. We propose a novel topological map representation, the Area Graph, in which the vertices represent areas and edges represent passages. The Area Graph is developed from a pruned Voronoi Graph, the Topology Graph. The paper also presents path planning as one application for the Area Graph. For that, we derive a so-called Passage Graph from the Area Graph. Because our algorithm segments the map as a set of areas, the first experiment compares the results of the Area Graph with those of state-of-the-art segmentation approaches, which shows that our method effectively prevents over-segmentation. Then the second experiment shows the superiority of our method over the traditional A* planning algorithm.
RO Nov 5, 2018
Incrementally Building Topology Graphs via Distance Maps
Yijun Yuan, Sören Schwertfeger
Mapping is an essential task for mobile robots and topological representation often works as a basis for the various applications. In this paper, a novel framework that can build topological maps incrementally is proposed. The algorithm is using a distance map, and in our framework the topological map can grow as we append more sensor data to the map. To demonstrate our algorithm, we show the result of the distance map based method on several popular maps and run the incremental framework with raw sensor data to have a growing topological map, as an example of a robot exploring the environment.
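For readers unfamiliar with distance maps, the sketch below computes one on a small occupancy grid with a multi-source breadth-first ("brushfire") expansion from all obstacle cells. It is a generic illustration under stated assumptions, not the paper's incremental framework: the function name `distance_map`, the 4-connected neighbourhood, the grid encoding (1 = obstacle, 0 = free) and the toy room are all hypothetical.

```python
from collections import deque


def distance_map(grid):
    """4-connected step distance from every cell to its nearest obstacle (1).

    Cells stay None if the grid contains no obstacle at all.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the expansion with every obstacle cell at distance 0.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                dist[r][c] = 0
                queue.append((r, c))
    # Breadth-first wavefront: the first visit of a cell is its shortest distance.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


room = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
dmap = distance_map(room)
```

Cells that are local maxima of such a map lie far from all walls, which is why distance maps are a common starting point for skeleton-like topological structures.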