Sören Schwertfeger

h-index: 23

34 papers

368 citations

Novelty: 43%

AI Score: 45

Ranked #43,949 of 194,257 authors (top 23%); #1,202 in RO (top 18%)

34 Papers

7.7 · RO · Mar 31 · Code

Generation of Indoor Open Street Maps for Robot Navigation from CAD Files

Jiajie Zhang, Shenrui Wu, Xu Ma et al.

The deployment of autonomous mobile robots is predicated on the availability of environmental maps, yet conventional generation via SLAM (Simultaneous Localization and Mapping) suffers from significant limitations in time, labor, and robustness, particularly in dynamic, large-scale indoor environments where map obsolescence can lead to critical localization failures. To address these challenges, this paper presents a complete and automated system for converting architectural Computer-Aided Design (CAD) files into a hierarchical topometric OpenStreetMap (OSM) representation, tailored for robust life-long robot navigation. Our core methodology involves a multi-stage pipeline that first isolates key structural layers from the raw CAD data and then employs an AreaGraph-based topological segmentation to partition the building layout into a hierarchical graph of navigable spaces. This process yields a comprehensive and semantically rich map, further enhanced by automatically associating textual labels from the CAD source and cohesively merging multiple building floors into a unified, topologically-correct model. By leveraging the permanent structural information inherent in CAD files, our system circumvents the inefficiencies and fragility of SLAM, offering a practical and scalable solution for deploying robots in complex indoor spaces. The software is encapsulated within an intuitive Graphical User Interface (GUI) to facilitate practical use. The code and dataset are available at https://github.com/jiajiezhang7/osmAG-from-cad.

3.7 · CV · May 25, 2022 · Code

Spotlights: Probing Shapes from Spherical Viewpoints

Jiaxin Wei, Lige Liu, Ran Cheng et al.

Recent years have witnessed the surge of learned representations that directly build upon point clouds. Though becoming increasingly expressive, most existing representations still struggle to generate ordered point sets. Inspired by spherical multi-view scanners, we propose a novel sampling model called Spotlights to represent a 3D shape as a compact 1D array of depth values. It simulates the configuration of cameras evenly distributed on a sphere, where each virtual camera casts light rays from its principal point through sample points on a small concentric spherical cap to probe for the possible intersections with the object surrounded by the sphere. The structured point cloud is hence given implicitly as a function of depths. We provide a detailed geometric analysis of this new sampling scheme and prove its effectiveness in the context of the point cloud completion task. Experimental results on both synthetic and real data demonstrate that our method achieves competitive accuracy and consistency while having a significantly reduced computational cost. Furthermore, we show superior performance on the downstream point cloud registration task over state-of-the-art completion methods.

2.2 · RO · Aug 1, 2024 · Code

High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets

Jian Li, Bowen Xu, Sören Schwertfeger

Robotic datasets are important for scientific benchmarking and developing algorithms, for example for Simultaneous Localization and Mapping (SLAM). Modern robotic datasets feature video data of high resolution and high framerates. Storing and sharing those datasets thus becomes very costly, especially if more than one camera is used. It is therefore essential to store this video data in a compressed format. This paper investigates the use of modern video encoders for robotic datasets. We provide software that can replay mp4 videos within the ROS 1 and ROS 2 frameworks, supporting synchronized playback in simulated time. Furthermore, the paper evaluates different encoders and their settings to find optimal configurations in terms of resulting size, quality and encoding time. Through this work we show that it is possible to store and share even the highest-quality video datasets within reasonable storage constraints.

18.9 · CV · Aug 5, 2021 · Code

Video Contrastive Learning with Global Context

Haofei Kuang, Yi Zhu, Zhi Zhang et al.

Contrastive learning has revolutionized the field of self-supervised image representation learning and has recently been adapted to the video domain. One of the greatest advantages of contrastive learning is that it allows us to flexibly define powerful loss objectives as long as we can find a reasonable way to formulate positive and negative samples to contrast. However, existing approaches rely heavily on short-range spatiotemporal salience to form clip-level contrastive signals, which prevents them from using global context. In this paper, we propose a new video-level contrastive learning method based on segments to formulate positive pairs. Our formulation is able to capture global context in a video and is thus robust to temporal content change. We also incorporate a temporal order regularization term to enforce the inherent sequential structure of videos. Extensive experiments show that our video-level contrastive learning framework (VCLR) is able to outperform previous state-of-the-art methods on five video datasets for downstream action classification, action localization and video retrieval. Code is available at https://github.com/amazon-research/video-contrastive-learning.

6.2 · CV · Nov 26, 2025

From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings

Jiajie Zhang, Sören Schwertfeger, Alexander Kleiner

We present a novel unsupervised framework to unlock vast unlabeled human demonstration data from continuous industrial video streams for Vision-Language-Action (VLA) model pre-training. Our method first trains a lightweight motion tokenizer to encode motion dynamics, then employs an unsupervised action segmenter leveraging a novel "Latent Action Energy" metric to discover and segment semantically coherent action primitives. The pipeline outputs both segmented video clips and their corresponding latent action sequences, providing structured data directly suitable for VLA pre-training. Evaluations on public benchmarks and a proprietary electric motor assembly dataset demonstrate effective segmentation of key tasks performed by humans at workstations. Further clustering and quantitative assessment via a Vision-Language Model confirm the semantic coherence of the discovered action primitives. To our knowledge, this is the first fully automated end-to-end system for extracting and organizing VLA pre-training data from unstructured industrial videos, offering a scalable solution for embodied AI integration in manufacturing.

25.3 · RO · Feb 21, 2024

RealDex: Towards Human-like Grasping for Robotic Dexterous Hand

Yumeng Liu, Yaxun Yang, Youzhuo Wang et al.

In this paper, we introduce RealDex, a pioneering dataset capturing authentic dexterous hand grasping motions infused with human behavioral patterns, enriched by multi-view and multimodal visual data. Utilizing a teleoperation system, we seamlessly synchronize human-robot hand poses in real time. This collection of human-like motions is crucial for training dexterous hands to mimic human movements more naturally and precisely. RealDex holds immense promise in advancing humanoid robots for automated perception, cognition, and manipulation in real-world scenarios. Moreover, we introduce a cutting-edge dexterous grasping motion generation framework, which aligns with human experience and enhances real-world applicability by effectively utilizing Multimodal Large Language Models. Extensive experiments have demonstrated the superior performance of our method on RealDex and other open datasets. The complete dataset and code will be made available upon the publication of this work.

8.3 · RO · Mar 11, 2024

3DRef: 3D Dataset and Benchmark for Reflection Detection in RGB and Lidar Data

Xiting Zhao, Sören Schwertfeger

Reflective surfaces present a persistent challenge for reliable 3D mapping and perception in robotics and autonomous systems. However, existing reflection datasets and benchmarks remain limited to sparse 2D data. This paper introduces the first large-scale 3D reflection detection dataset containing more than 50,000 aligned samples of multi-return Lidar, RGB images, and 2D/3D semantic labels across diverse indoor environments with various reflections. Textured 3D ground truth meshes enable automatic point cloud labeling to provide precise ground truth annotations. Detailed benchmarks evaluate three Lidar point cloud segmentation methods, as well as current state-of-the-art image segmentation networks for glass and mirror detection. The proposed dataset advances reflection detection by providing a comprehensive testbed with precise global alignment, multi-modal data, and diverse reflective objects and materials. It will drive future research towards reliable reflection detection. The dataset is publicly available at http://3dref.github.io

6.9 · RO · Feb 2, 2022 · Code

Accurate calibration of multi-perspective cameras from a generalization of the hand-eye constraint

Yifu Wang, Wenqing Jiang, Kun Huang et al.

Multi-perspective cameras are quickly gaining importance in many applications such as smart vehicles and virtual or augmented reality. However, a large system size or an absence of overlap in neighbouring fields-of-view often complicates their calibration. We present a novel solution which relies on the availability of an external motion capture system. Our core contribution consists of an extension to the hand-eye calibration problem which jointly solves multi-eye-to-base problems in closed form. We furthermore demonstrate its equivalence to the multi-eye-in-hand problem. The practical validity of our approach is supported by our experiments, indicating that the method is highly efficient and accurate, and outperforms existing closed-form alternatives.

3.0 · RO · Nov 16, 2021

Hierarchical Topometric Representation of 3D Robotic Maps

Zhenpeng He, Hao Sun, Jiawei Hou et al.

In this paper, we propose a method for generating a hierarchical, volumetric topological map from 3D point clouds. There are three basic hierarchical levels in our map: storey - region - volume. The advantages of our method are reflected in both input and output. In terms of input, we accept multi-storey point clouds and building structures with sloping roofs or ceilings. In terms of output, we can generate results with metric information of different dimensionality that are suitable for different robotics applications. The algorithm generates the volumetric representation by generating volumes from a 3D voxel occupancy map. We then add passages (connections between volumes), combine small volumes into a big region and use a 2D segmentation method for better topological representation. We evaluate our method on several freely available datasets. The experiments highlight the advantages of our approach.

2.2 · RO · Nov 17, 2020

Improved Visual-Inertial Localization for Low-cost Rescue Robots

Xiaoling Long, Qingwen Xu, Yijun Yuan et al.

This paper improves visual-inertial systems to boost the localization accuracy for low-cost rescue robots. When robots traverse rugged terrain, the performance of pose estimation suffers from large noise in the measurements of the inertial sensors due to ground contact forces, especially for low-cost sensors. Therefore, we propose Threshold-based and Dynamic Time Warping-based methods to detect abnormal measurements and mitigate such faults. The two methods are embedded into the popular VINS-Mono system to evaluate their performance. Experiments are performed on simulation and real robot data, which show that both methods increase the pose estimation accuracy. Moreover, the Threshold-based method performs better when the noise is small, and the Dynamic Time Warping-based one shows greater potential for large noise.

10.4 · RO · Jul 23, 2020

Advanced Mapping Robot and High-Resolution Dataset

Hongyu Chen, Zhijie Yang, Xiting Zhao et al.

This paper presents a fully hardware synchronized mapping robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. Nine high-resolution cameras and two 32-beam 3D Lidars were used along with a professional, static 3D scanner for ground truth map collection. With all the sensors calibrated on the mapping robot, three datasets are collected to evaluate the performance of mapping algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. We provide the datasets for download and the mapping and evaluation procedures are made in a very easily reproducible manner for maximum comparability. We have also conducted a survey on available robotics-related datasets and compiled a big table with those datasets and a number of properties of them.

12.2 · RO · Mar 11, 2020 · Code

Self-supervised Point Set Local Descriptors for Point Cloud Registration

Yijun Yuan, Jiawei Hou, Andreas Nüchter et al.

In this work, we propose to learn local descriptors for point clouds in a self-supervised manner. In each iteration of the training, the input of the network is merely one unlabeled point cloud. Building on our previous work, which directly solves the transformation between two point sets in one step without correspondences, the proposed method is able to train from one point cloud by supervising its self-rotation, which we randomly generate. The whole training requires no manual annotation. In several experiments we evaluate the performance of our method on various datasets and compare it to other state-of-the-art algorithms. The results show that our self-supervised learned descriptor achieves equivalent or even better performance than the supervised learned model, while being easier to train and not requiring labeled data.

4.1 · RO · Mar 1, 2020

Non-iterative One-step Solution for Point Set Registration Problem on Pose Estimation without Correspondence

Yijun Yuan, Dorit Borrmann, Andreas Nüchter et al.

In this work, we propose to directly find the one-step solution for the point set registration problem without correspondences. Inspired by the Kernel Correlation method, we consider the fully connected objective function between two point sets, thus avoiding the computation of correspondences. By utilizing least-squares minimization, the transformed objective function is directly solved with existing well-known closed-form solutions, e.g., singular value decomposition, that are usually used for given correspondences. However, using equal cost weights for each connection will degenerate the solution due to the large influence of distant pairs. Thus, we additionally set a scale on each term to avoid high costs on non-important pairs. As in feature-based registration methods, the similarity between descriptors of points determines the scaling weight. Given the weights, we get a one-step solution. As the runtime is in $\mathcal{O}(n^2)$, we also propose a variant with keypoints that strongly reduces the cost. The experiments show that the proposed method gives a one-step solution without an initial guess. Our method exhibits competitive outlier robustness and accuracy compared to various other methods, and it is more stable in case of large rotations. Additionally, our one-step solution achieves a performance on par with the state-of-the-art feature-based method TEASER.
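The idea of a weighted, fully connected least-squares objective can be illustrated with a minimal sketch. It is not the authors' implementation: it is restricted to 2D, where the optimal rotation has a closed form via `atan2` rather than the singular value decomposition mentioned in the abstract, and the function name `weighted_rigid_fit_2d` and the explicit weight matrix are assumptions made for illustration. In the paper the weights come from descriptor similarity; here they are simply passed in.

```python
import math


def weighted_rigid_fit_2d(src, dst, weights):
    """Return (theta, tx, ty) minimizing sum_ij w_ij * |R(theta) p_i + t - q_j|^2.

    src, dst: lists of (x, y) points; weights[i][j] scales the cost of
    connecting src[i] to dst[j] (hypothetical input, e.g. descriptor similarity).
    """
    total = pcx = pcy = qcx = qcy = 0.0
    # Weighted centroids over all (i, j) connections: O(n^2) pairs.
    for i, (px, py) in enumerate(src):
        for j, (qx, qy) in enumerate(dst):
            w = weights[i][j]
            total += w
            pcx += w * px
            pcy += w * py
            qcx += w * qx
            qcy += w * qy
    pcx, pcy, qcx, qcy = pcx / total, pcy / total, qcx / total, qcy / total

    # Accumulate the weighted dot and cross products of the centered points.
    dot = cross = 0.0
    for i, (px, py) in enumerate(src):
        for j, (qx, qy) in enumerate(dst):
            w = weights[i][j]
            ax, ay = px - pcx, py - pcy
            bx, by = qx - qcx, qy - qcy
            dot += w * (ax * bx + ay * by)
            cross += w * (ax * by - ay * bx)

    # The angle maximizing cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation aligns the rotated source centroid with the target centroid.
    tx = qcx - (c * pcx - s * pcy)
    ty = qcy - (s * pcx + c * pcy)
    return theta, tx, ty
```

With an identity weight matrix the sketch reduces to the classic known-correspondence case. With uniform weights the rotation becomes undetermined, which is the degeneration that the descriptor-based scaling is meant to avoid.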

1.9 · RO · Dec 3, 2019

Comparison and Evaluation of 2D and 3D Range Sensors

Qingwen Xu, Sören Schwertfeger

For mobile robots, range sensors are important to perceive the environment. Sensors that can measure in a 3D volume are especially significant for outdoor robotics, because this environment is often highly unstructured. The quality of the data gathered by those sensors influences all algorithms relying on it. This paper therefore measures the precision of several 2D and 2.5D sensors at different ranges and different incidence angles. The results of all tests are presented and analyzed.

1.9 · RO · Dec 3, 2019

Room Detection for Topological Maps

Sören Schwertfeger, Tianyan Yu

Mapping is an important part of many robotic applications. In order to measure the performance of the mapping process we have to measure the quality of its result: the map. The map is essential for robotic algorithms like localization and path planning. Previously it was shown how matched Topology Graphs can be used for map evaluation by comparing the topology of the robot generated map to the topology of a ground truth map. In this paper we are extending the previous work by detecting open areas, for example rooms, in the 2D grid map and adding those to the topological representation. This way we can avoid the unreliable generation of paths in open areas, thus making the Topology Graph generation, and through that also the Topology Graph matching, more stable and robust. The detection applies the alpha shape algorithm for room detection.

1.9 · RO · Dec 3, 2019

Evaluation of Smartphone IMUs for Small Mobile Search and Rescue Robots

Xiangyang Zhi, Qingwen Xu, Sören Schwertfeger

Small mobile robots are an important class of Search and Rescue Robots. Integrating all required components into such small robots is a difficult engineering task. Smartphones have already been made small, lightweight and cheap by the industry and are thus an excellent candidate as main controller for such robots. In this paper we outline how ROS can be used on Android devices and then evaluate one sensor which is very important for mobile robots: the Inertial Measurement Unit (IMU). Experiments are performed under static and dynamic conditions to measure the error of the IMUs of three smartphones and three professional IMUs. In the experiments we make use of a tracking system and an autonomous mobile robot.

4.9 · RO · Nov 18, 2019

Fast 2D Map Matching Based on Area Graphs

Jiawei Hou, Haofei Kuang, Sören Schwertfeger

We present a novel area matching algorithm for merging two different 2D grid maps. There are many approaches to this problem; nevertheless, most previous work is built on assumptions such as a rigid transformation or similar scale and modalities of the two maps. In this work we propose a 2D map matching algorithm based on area segmentation. We convert general 2D occupancy grid maps to an area graph representation, then compute the correct results by voting in that space. In the experiments, we compare with a state-of-the-art method applied to the matching of sensor maps with ground truth layout maps. The experiments show that our algorithm performs better on large-scale maps and computes faster.

2.0 · AI · Nov 15, 2019

Fine-grained Qualitative Spatial Reasoning about Point Positions

Sören Schwertfeger

The ability to persist in the spatial environment is an essential feature, not only in the robotic context. Positional knowledge is one of the most important aspects of space, and a number of methods to represent this information have been developed in the research area of spatial cognition. The basic qualitative spatial representation and reasoning techniques are presented in this thesis and several calculi are briefly reviewed. Features and applications of qualitative calculi are summarized. A new calculus for representing and reasoning about qualitative spatial orientation and distances is designed. It supports an arbitrary level of granularity over ternary relations of points. Ways of improving the complexity of the composition are shown, and an implementation of the calculus demonstrates its capabilities. Existing qualitative spatial calculi of positional information are compared to the new approach and possibilities for future research are outlined.

7.3 · RO · Nov 2, 2019

Furniture Free Mapping using 3D Lidars

Zhenpeng He, Jiawei Hou, Sören Schwertfeger

Mobile robots depend on maps for localization, planning, and other applications. In indoor scenarios, there is often lots of clutter present, such as chairs, tables, other furniture, or plants. While mapping this clutter is important for certain applications, for example navigation, maps that represent just the immobile parts of the environment, i.e. walls, are needed for other applications, like room segmentation or long-term localization. In the literature, approaches can be found that use a complete point cloud to remove the furniture in the room and generate a furniture free map. In contrast, we propose a Simultaneous Localization And Mapping (SLAM)-based mobile laser scanning solution. The robot uses an orthogonal pair of Lidars. The horizontal scanner aims to estimate the robot position, whereas the vertical scanner generates the furniture free map. There are three steps in our method: point cloud rearrangement, wall plane detection and semantic labeling. In the experiment, we evaluate the efficiency of removing furniture in a typical indoor environment. We get 99.60% precision in keeping the wall in the 3D result, which shows that our algorithm can remove most of the furniture in the environment. Furthermore, we introduce the application of 2D furniture free mapping for room segmentation.

3.5 · RO · Oct 2, 2019

Pose Estimation for Omni-directional Cameras using Sinusoid Fitting

Haofei Kuang, Qingwen Xu, Xiaoling Long et al.

We propose a novel pose estimation method for geometric vision of omni-directional cameras. On the basis of the regularity of the pixel movement after camera pose changes, we formulate and prove the sinusoidal relationship between pixel movement and camera motion. We use the improved Fourier-Mellin invariant (iFMI) algorithm to find the motion of pixels, which was shown to be more accurate and robust than feature-based methods. While iFMI works only on pin-hole model images and estimates 4 parameters (x, y, yaw, scaling), our method works on panoramic images and estimates the full 6 DoF 3D transform, up to an unknown scale factor. For that we fit the motion of the pixels in the panoramic images, as determined by iFMI, to two sinusoidal functions. The offsets, amplitudes and phase-shifts of the two functions then represent the 3D rotation and translation of the camera between the two images. We perform experiments for 3D rotation, which show that our algorithm outperforms the feature-based methods in accuracy and robustness. We leave the more complex 3D translation experiments for future work.
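Fitting an offset, amplitude and phase shift to sampled motions can be sketched with ordinary linear least squares. The sketch below is an illustration only, not the paper's code. It fits a single sinusoid of known period 2π using the identity offset + a·sin(t + φ) = offset + A·sin t + B·cos t, which makes the problem linear in (offset, A, B). The function names `solve3` and `fit_sinusoid` are hypothetical.

```python
import math


def solve3(m, v):
    """Solve the 3x3 linear system m x = v by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def fit_sinusoid(angles, values):
    """Fit values ~ offset + amplitude * sin(angle + phase); return the three parameters."""
    # Design-matrix rows for the basis functions 1, sin(angle), cos(angle).
    rows = [(1.0, math.sin(t), math.cos(t)) for t in angles]
    # Normal equations (A^T A) x = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    offset, a_sin, a_cos = solve3(ata, aty)
    amplitude = math.hypot(a_sin, a_cos)
    phase = math.atan2(a_cos, a_sin)
    return offset, amplitude, phase
```

The samples need at least three distinct angles for the normal equations to be solvable. Real data would usually also call for outlier handling, which is omitted here.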

4.9 · RO · Oct 1, 2019

Area Graph: Generation of Topological Maps using the Voronoi Diagram

Jiawei Hou, Yijun Yuan, Sören Schwertfeger

Representing a scanned map of the real environment as a topological structure is an important research topic in robotics. Since topological representations of maps save a huge amount of map storage space and online computing time, they are widely used in fields such as path planning, map matching, and semantic mapping. We use a topological map representation, the Area Graph, in which the vertices represent areas and edges represent passages. The Area Graph is developed from a pruned Voronoi Graph, the Topology Graph. We also employ a simple room detection algorithm to compensate for the fact that the Voronoi Graph gets unstable in open areas. We claim that our area segmentation method is superior to state-of-the-art approaches in complex indoor environments and support this claim with a number of experiments.
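The vertex-and-edge structure described above (areas as vertices, passages as edges) can be pictured with a toy data structure. This is a sketch under assumptions: the class name `AreaGraph`, its methods and the breadth-first `route` query are hypothetical and say nothing about how the paper derives the graph from the Voronoi diagram.

```python
from collections import deque


class AreaGraph:
    """Toy topological map: vertices are areas, edges are passages."""

    def __init__(self):
        # area name -> list of (passage name, neighbouring area name)
        self.adjacency = {}

    def add_area(self, area):
        self.adjacency.setdefault(area, [])

    def add_passage(self, passage, area_a, area_b):
        # A passage connects two areas in both directions.
        self.add_area(area_a)
        self.add_area(area_b)
        self.adjacency[area_a].append((passage, area_b))
        self.adjacency[area_b].append((passage, area_a))

    def route(self, start, goal):
        """Breadth-first search; return the list of areas from start to goal, or None."""
        parent = {start: None}
        queue = deque([start])
        while queue:
            area = queue.popleft()
            if area == goal:
                path = []
                while area is not None:
                    path.append(area)
                    area = parent[area]
                return path[::-1]
            for _passage, neighbour in self.adjacency.get(area, []):
                if neighbour not in parent:
                    parent[neighbour] = area
                    queue.append(neighbour)
        return None
```

Planning on such a graph touches only a handful of vertices instead of every grid cell, which is the storage and computation saving the abstract refers to.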

4.9 · RO · Sep 27, 2019

Mapping with Reflection -- Detection and Utilization of Reflection in 3D Lidar Scans

Xiting Zhao, Zhijie Yang, Sören Schwertfeger

This paper presents a method to detect reflections in 3D light detection and ranging (Lidar) scans and uses them to classify the points and also map objects outside the line of sight. Our software uses several approaches to analyze the point cloud, including intensity peak detection, dual return detection, plane fitting, and finding the boundaries. These approaches can classify the point cloud and detect the reflections in it. By mirroring the reflection points on the detected window pane and adding classification labels to the points, we can improve the map quality in a Simultaneous Localization and Mapping (SLAM) framework. Experiments using real scan data and ground truth data showcase the effectiveness of our method.

10.1 · RO · Sep 23, 2019

Improving CNN-based Planar Object Detection with Geometric Prior Knowledge

Jianxiong Cai, Jiawei Hou, Yiren Lu et al.

In this paper, we focus on the question: how might mobile robots take advantage of affordable RGB-D sensors for object detection? Although current CNN-based object detectors have achieved impressive results, there are three main drawbacks for practical usage on mobile robots: 1) It is hard and time-consuming to collect and annotate large-scale training sets. 2) Training usually takes a long time. 3) CNN-based object detection shows significant weakness in predicting location. We propose an improved method for the detection of planar objects, which rectifies images with geometric information to compensate for the perspective distortion before feeding them to the CNN detector module, typically a CNN-based detector like YOLO or Mask R-CNN. By dealing with the perspective distortion in advance, we eliminate the need for the CNN detector to learn it. Experiments show that this approach significantly boosts the detection performance. Besides, it effectively reduces the number of training images required. In addition to the novel detection framework proposed, we also release an RGB-D dataset and source code for hazmat sign detection. To the best of our knowledge, this is the first work on image rectification for CNN-based object detection, and the dataset is the first publicly available hazmat sign detection dataset with RGB-D sensors.

1.9 · RO · Sep 23, 2019

Path Planning Tolerant to Degraded Locomotion Conditions

Xiaoling Long, Sören Schwertfeger

Mobile robots, especially those driving outdoors and in unstructured terrain, sometimes suffer from failures and errors in locomotion, like unevenly pressurized or flat tires, loose axes or de-tracked tracks. Those are errors that go unnoticed by the odometry of the robot. Other factors that influence the locomotion performance of the robot, like the weight and distribution of the payload, the terrain over which the robot is driving or the battery charge, cannot be compensated for by the PID speed or position controller of the robot because of the physical limits of the system. Traditional planning systems are oblivious to those problems and may thus plan infeasible trajectories. Also, path following modules that are oblivious to those problems will generate sub-optimal motion patterns, if they can get to the goal at all. In this paper, we present an adaptive path planning algorithm that is tolerant to such degraded locomotion conditions. We do this by constantly observing the executed motions of the robot via simultaneous localization and mapping (SLAM). From the executed path and the given motion commands, we constantly collect and cluster motion primitives (MP) on the fly, which are in turn used for planning. The robot can therefore automatically detect and adapt to different locomotion conditions and reflect those in the planned paths.

1.9 · RO · Sep 17, 2019

Configuration-Space Flipper Planning on 3D Terrain

Yijun Yuan, Qingwen Xu, Sören Schwertfeger

Flippers are essential components of tracked robot locomotion systems for unstructured terrain, especially within a rescue scenario. Achieving full and semi-autonomy for such rescue robots is the goal of many research efforts. In this work, we propose an algorithm to plan the morphologies of a small rescue robot with four flippers over 3D ground without any extra sensor, such as a pressure sensor. To achieve this goal, we simplify the rescue robot as a skeleton on inflated terrain. Its morphology can be represented by configurations of several parameters. Then we plan the mobile movement on 3D terrain with four individually manipulated flippers. We perform real robot experiments on three different obstacles. The results show that we move the flippers very effectively and are thus able to tackle those terrains very well.

3.5 · RO · Jun 12, 2019

Adaptive Navigation Scheme for Optimal Deep-Sea Localization Using Multimodal Perception Cues

Arturo Gomez Chavez, Qingwen Xu, Christian A. Mueller et al.

Underwater robot interventions require a high level of safety and reliability. A major challenge to address is a robust and accurate acquisition of localization estimates, as it is a prerequisite to enable more complex tasks, e.g. floating manipulation and mapping. State-of-the-art navigation in commercial operations, such as oil & gas production (OGP), relies on costly instrumentation. This instrumentation can be partially replaced or assisted by visual navigation methods, especially in deep-sea scenarios where equipment deployment has high costs and risks. Our work presents a multimodal approach that adapts state-of-the-art methods from on-land robotics, i.e., dense point cloud generation in combination with plane representation and registration, to boost underwater localization performance. A two-stage navigation scheme is proposed that initially generates a coarse probabilistic map of the workspace, which is used to filter noise from computed point clouds and planes in the second stage. Furthermore, an adaptive decision-making approach is introduced that determines which perception cues to incorporate into the localization filter to optimize accuracy and computation performance. Our approach is investigated first in simulation and then validated with data from field trials in OGP monitoring and maintenance scenarios.

9.2 · RO · May 27, 2019

Heterogeneous Multi-sensor Calibration based on Graph Optimization

Hongyu Chen, Sören Schwertfeger

Many robotics and mapping systems contain multiple sensors to perceive the environment. Extrinsic parameter calibration, the identification of the position and rotation transform between the frames of the different sensors, is critical to fuse data from different sensors. When obtaining multiple camera to camera, lidar to camera and lidar to lidar calibration results, inconsistencies are likely. We propose a graph-based method to refine the relative poses of the different sensors. We demonstrate our approach using our mapping robot platform, which features twelve sensors that are to be calibrated. The experimental results confirm that the proposed algorithm yields great performance.

7.3 · RO · May 23, 2019

Towards Generation and Evaluation of Comprehensive Mapping Robot Datasets

Hongyu Chen, Xiting Zhao, Jianwen Luo et al.

This paper presents a fully hardware synchronized mapping robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. We also employ a professional, static 3D scanner for ground truth map collection. Three datasets are generated to evaluate the performance of mapping algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. The mapping and evaluation procedures are made in a very easily reproducible manner for maximum comparability. In the end we can draw a couple of conclusions about the tested SLAM algorithms.

2.6 · CV · May 23, 2019

Depth Estimation on Underwater Omni-directional Images Using a Deep Neural Network

Haofei Kuang, Qingwen Xu, Sören Schwertfeger

In this work, we exploit a depth estimation Fully Convolutional Residual Neural Network (FCRN) for in-air perspective images to estimate the depth of underwater perspective and omni-directional images. We train one conventional and one spherical FCRN for underwater perspective and omni-directional images, respectively. The spherical FCRN is derived from the perspective FCRN via a spherical longitude-latitude mapping. For that, the omni-directional camera is modeled as a sphere, while images captured by it are displayed in the longitude-latitude form. Due to the lack of underwater datasets, we synthesize images in both data-driven and theoretical ways, which are used in training and testing. Finally, experiments are conducted on these synthetic images and results are presented both qualitatively and quantitatively. The comparison between ground truth and the estimated depth map indicates the effectiveness of our method.
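The longitude-latitude form of a spherical camera image can be made concrete with a small sketch that maps a viewing direction to a pixel in an equirectangular image and back. The axis convention (z forward, y up, image centre looking along +z) and the function names are assumptions for illustration and need not match the convention used in the paper.

```python
import math


def direction_to_pixel(x, y, z, width, height):
    """Map a 3D viewing direction to (u, v) in a longitude-latitude image."""
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)        # longitude in (-pi, pi], zero along +z
    lat = math.asin(y / norm)     # latitude in [-pi/2, pi/2], positive upwards
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v


def pixel_to_direction(u, v, width, height):
    """Inverse mapping: pixel (u, v) to a unit viewing direction (x, y, z)."""
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )
```

Rows of such an image correspond to constant latitude, so the horizontal sampling density on the sphere shrinks towards the poles. That distortion is the reason a network designed for perspective images needs adapting before it is applied to this image form.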

3.5ROMay 8, 2019

Configuration-Space Flipper Planning for Rescue Robots

Yijun Yuan, Letong Wang, Sören Schwertfeger

For rescue robots, flippers give the robot additional ability to traverse varied terrain, and autonomous flipper motion is becoming more important. In recent work, autonomy is achieved either by planning with several special states or by relying on collected data. We investigate whether it is possible to build continuous states without collecting prior trial data. In this paper, we first model the possible states as a global planning path with a parameter configuration of the scene. Then, we follow the path to achieve the autonomous run. We plot the morphology of each path point to show the correctness of the path and implement a simple path follower on a real robot to demonstrate the performance of our algorithm.

8.3ROJan 15, 2019

Learning Autonomous Exploration and Mapping with Semantic Vision

Xiangyang Zhi, Xuming He, Sören Schwertfeger

We address the problem of autonomous exploration and mapping for a mobile robot using visual inputs. Exploration and mapping is a well-known and key problem in robotics, the goal of which is to enable a robot to explore a new environment autonomously and create a map for future use. In contrast to classical methods, we propose a learning-based approach built on semantic interpretation of visual scenes. Our method is based on a deep network consisting of three modules: a semantic segmentation network, a mapping module using camera geometry, and an exploration action network. All modules are differentiable, so the whole pipeline is trained end-to-end within an actor-critic framework. Our network makes action decisions step by step and generates the free space map simultaneously. To the best of our knowledge, this is the first algorithm that formulates exploration and mapping in a learning framework. We validate our approach in simulated real-world environments and demonstrate performance gains over competitive baseline approaches.

5.3RONov 26, 2018Code

Fast Gaussian Process Occupancy Maps

Yijun Yuan, Haofei Kuang, Sören Schwertfeger

In this paper, we demonstrate our work on Gaussian Process Occupancy Mapping (GPOM). We concentrate on the inefficiency of the frame computation of the classical GPOM approaches. In robotics, most algorithms are required to run in real time. However, the high cost of computation makes the classical GPOM less useful. In this paper we do not try to optimize the Gaussian Process itself; instead, we focus on the application. By analyzing the time cost of each step of the algorithm, we find a way to reduce the cost while maintaining good performance compared to the general GPOM framework. Our experiments show that our model enables GPOM to run online and achieve better quality than the classical GPOM.

2.9RONov 13, 2018

Topological Area Graph Generation and its Application to Path Planning

Jiawei Hou, Yijun Yuan, Sören Schwertfeger

Representing a scanned map of the real environment as a topological structure is an important research topic in robotics. Since topological representations of maps save a huge amount of map storage space and online computing time, they are widely used in fields such as path planning, map matching, and semantic mapping. We propose a novel topological map representation, the Area Graph, in which the vertices represent areas and edges represent passages. The Area Graph is developed from a pruned Voronoi Graph, the Topology Graph. The paper also presents path planning as one application for the Area Graph. For that, we derive a so-called Passage Graph from the Area Graph. Because our algorithm segments the map into a set of areas, the first experiment compares the results of the Area Graph with those of state-of-the-art segmentation approaches, which shows that our method effectively prevents over-segmentation. The second experiment shows the superiority of our method over the traditional A* planning algorithm.
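
The abstract states that vertices of the Area Graph are areas and edges are passages, but it does not define the Passage Graph. The sketch below assumes one plausible reading: passages become the vertices, and two passages are linked when they border the same area. The function name, the edge-list format, and the room and door names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

def passage_graph(area_edges):
    """Build a passage-centred graph from Area Graph edges.

    area_edges: list of (passage, area_a, area_b) tuples, one per passage.
    Returns a dict mapping each passage to a sorted list of passages
    that share at least one area with it (assumed definition).
    """
    passages_of_area = defaultdict(set)
    for passage, area_a, area_b in area_edges:
        passages_of_area[area_a].add(passage)
        passages_of_area[area_b].add(passage)

    # Start with every passage present, so isolated passages are kept.
    adjacency = {passage: set() for passage, _, _ in area_edges}
    for passages in passages_of_area.values():
        for p, q in combinations(sorted(passages), 2):
            adjacency[p].add(q)
            adjacency[q].add(p)
    return {p: sorted(neighbours) for p, neighbours in adjacency.items()}

# Hypothetical floor plan: a corridor with two rooms, and a third room behind room2.
edges = [
    ("door1", "corridor", "room1"),
    ("door2", "corridor", "room2"),
    ("door3", "room2", "room3"),
]
print(passage_graph(edges))
# {'door1': ['door2'], 'door2': ['door1', 'door3'], 'door3': ['door2']}
```

Under this reading, a planner could search over a handful of passages instead of every grid cell, which is consistent with the storage and computation savings the abstract attributes to topological representations.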

4.2RONov 5, 2018

Incrementally Building Topology Graphs via Distance Maps

Yijun Yuan, Sören Schwertfeger

Mapping is an essential task for mobile robots, and topological representations often serve as a basis for various applications. In this paper, a novel framework that can build topological maps incrementally is proposed. The algorithm uses a distance map, and in our framework the topological map can grow as more sensor data is appended to the map. To demonstrate our algorithm, we show the result of the distance-map-based method on several popular maps and run the incremental framework with raw sensor data to obtain a growing topological map, as an example of a robot exploring the environment.
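
For readers unfamiliar with distance maps, the following minimal sketch computes one on an occupancy grid: each cell stores its distance to the nearest obstacle. It uses a multi-source breadth-first search with 4-connectivity, which is a simplification assumed here for brevity; the abstract does not say how the distance map is computed or updated, and the function name and grid are hypothetical.

```python
from collections import deque

def distance_map(grid):
    """Return, for each cell, the 4-connected step distance to the nearest
    obstacle. grid is a list of rows; 1 marks an obstacle, 0 free space.
    Cells that cannot reach any obstacle keep the value None."""
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every obstacle cell at distance 0.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                dist[r][c] = 0
                queue.append((r, c))
    # Expand outwards one step at a time.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

# Hypothetical corridor: walls along the top and bottom rows.
corridor = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
for row in distance_map(corridor):
    print(row)
```

In this example the middle row holds the largest values. Such ridges of locally maximal distance run along the centre of free space, and they are the kind of structure from which a topological skeleton can be extracted.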