Juan Nieto

RO
77 papers
6,825 citations
Novelty 47%
AI Score 30

77 Papers

RO | Jan 4, 2021 | Code
Volumetric Grasping Network: Real-time 6 DOF Grasp Detection in Clutter

Michel Breyer, Jen Jen Chung, Lionel Ott et al.

General robot grasping in clutter requires the ability to synthesize grasps that work for previously unseen objects and that are also robust to physical interactions, such as collisions with other objects in the scene. In this work, we design and train a network that predicts 6 DOF grasps from 3D scene information gathered from an on-board sensor such as a wrist-mounted depth camera. Our proposed Volumetric Grasping Network (VGN) accepts a Truncated Signed Distance Function (TSDF) representation of the scene and directly outputs the predicted grasp quality and the associated gripper orientation and opening width for each voxel in the queried 3D volume. We show that our approach can plan grasps in only 10 ms and is able to clear 92% of the objects in real-world clutter removal experiments without the need for explicit collision checking. The real-time capability opens up the possibility for closed-loop grasp planning, allowing robots to handle disturbances, recover from errors and provide increased robustness. Code is available at https://github.com/ethz-asl/vgn.
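
As a rough illustration of how a per-voxel output like the one described above could be consumed, the sketch below picks the highest-quality voxel from three hypothetical maps (quality, orientation, opening width). The dictionary layout, the function name `select_best_grasp`, and the `min_quality` threshold are assumptions made for this example and are not taken from the VGN code base.

```python
from typing import Dict, Optional, Tuple

Voxel = Tuple[int, int, int]
Quaternion = Tuple[float, float, float, float]


def select_best_grasp(
    quality: Dict[Voxel, float],
    orientation: Dict[Voxel, Quaternion],
    width: Dict[Voxel, float],
    min_quality: float = 0.9,  # hypothetical acceptance threshold
) -> Optional[Tuple[Voxel, float, Quaternion, float]]:
    """Return (voxel, quality, orientation, width) for the best voxel.

    Voxels whose predicted quality is below `min_quality` are ignored.
    Returns None when no voxel passes the threshold.
    """
    best: Optional[Voxel] = None
    for voxel, q in quality.items():
        if q < min_quality:
            continue
        if best is None or q > quality[best]:
            best = voxel
    if best is None:
        return None
    return best, quality[best], orientation[best], width[best]


if __name__ == "__main__":
    identity = (0.0, 0.0, 0.0, 1.0)
    q_map = {(0, 0, 0): 0.20, (1, 2, 3): 0.95, (4, 4, 4): 0.91}
    o_map = {v: identity for v in q_map}
    w_map = {(0, 0, 0): 0.02, (1, 2, 3): 0.05, (4, 4, 4): 0.03}
    print(select_best_grasp(q_map, o_map, w_map))
```

Because every voxel already carries its own orientation and width, selection reduces to a threshold plus an argmax over the quality map.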

RO | Oct 19, 2020 | Code
A Unified Approach for Autonomous Volumetric Exploration of Large Scale Environments under Severe Odometry Drift

Lukas Schmid, Victor Reijgwart, Lionel Ott et al.

Exploration is a fundamental problem in robot autonomy. A major limitation, however, is that during exploration robots oftentimes have to rely on on-board systems alone for state estimation, accumulating significant drift over time in large environments. Drift can be detrimental to robot safety and exploration performance. In this work, a submap-based, multi-layer approach for both mapping and planning is proposed to enable safe and efficient volumetric exploration of large scale environments despite odometry drift. The central idea of our approach combines local (temporally and spatially) and global mapping to guarantee safety and efficiency. Similarly, our planning approach leverages the presented map to compute global volumetric frontiers in a changing global map and utilizes the nature of exploration dealing with partial information for efficient local and global planning. The presented system is thoroughly evaluated and shown to outperform state of the art methods even under drift-free conditions. Our system, termed GLocal, will be made available open source.

RO | Apr 27, 2020 | Code
Voxgraph: Globally Consistent, Volumetric Mapping using Signed Distance Function Submaps

Victor Reijgwart, Alexander Millane, Helen Oleynikova et al.

Globally consistent dense maps are a key requirement for long-term robot navigation in complex environments. While previous works have addressed the challenges of dense mapping and global consistency, most require more computational resources than may be available on-board small robots. We propose a framework that creates globally consistent volumetric maps on a CPU and is lightweight enough to run on computationally constrained platforms. Our approach represents the environment as a collection of overlapping Signed Distance Function (SDF) submaps, and maintains global consistency by computing an optimal alignment of the submap collection. By exploiting the underlying SDF representation, we generate correspondence free constraints between submap pairs that are computationally efficient enough to optimize the global problem each time a new submap is added. We deploy the proposed system on a hexacopter Micro Aerial Vehicle (MAV) with an Intel i7-8650U CPU in two realistic scenarios: mapping a large-scale area using a 3D LiDAR, and mapping an industrial space using an RGB-D camera. In the large-scale outdoor experiments, the system optimizes a 120x80m map in less than 4s and produces absolute trajectory RMSEs of less than 1m over 400m trajectories. Our complete system, called voxgraph, is available as open source.

RO | Dec 5, 2019 | Code
VersaVIS: An Open Versatile Multi-Camera Visual-Inertial Sensor Suite

Florian Tschopp, Michael Riner, Marius Fehr et al.

Robust and accurate pose estimation is crucial for many applications in mobile robotics. Extending visual Simultaneous Localization and Mapping (SLAM) with other modalities such as an inertial measurement unit (IMU) can boost robustness and accuracy. However, for a tight sensor fusion, accurate time synchronization of the sensors is often crucial. Changing exposure times, internal sensor filtering, multiple clock sources and unpredictable delays from operating system scheduling and data transfer can make sensor synchronization challenging. In this paper, we present VersaVIS, an Open Versatile Multi-Camera Visual-Inertial Sensor Suite aimed to be an efficient research platform for easy deployment, integration and extension for many mobile robotic applications. VersaVIS provides a complete, open-source hardware, firmware and software bundle to perform time synchronization of multiple cameras with an IMU featuring exposure compensation, host clock translation and independent and stereo camera triggering. The sensor suite supports a wide range of cameras and IMUs to match the requirements of the application. The synchronization accuracy of the framework is evaluated on multiple experiments achieving timing accuracy of less than 1 ms. Furthermore, the applicability and versatility of the sensor suite is demonstrated in multiple applications including visual-inertial SLAM, multi-camera applications, multimodal mapping, reconstruction and object based mapping.

RO | Sep 27, 2019 | Code
SegMap: Segment-based mapping and localization using data-driven descriptors

Renaud Dubé, Andrei Cramariuc, Daniel Dugas et al.

Precisely estimating a robot's pose in a prior, global map is a fundamental capability for mobile robotics, e.g. autonomous driving or exploration in disaster zones. This task, however, remains challenging in unstructured, dynamic environments, where local features are not discriminative enough and global scene descriptors only provide coarse information. We therefore present SegMap: a map representation solution for localization and mapping based on the extraction of segments in 3D point clouds. Working at the level of segments offers increased invariance to view-point and local structural changes, and facilitates real-time processing of large-scale 3D data. SegMap exploits a single compact data-driven descriptor for performing multiple tasks: global localization, 3D dense map reconstruction, and semantic information extraction. The performance of SegMap is evaluated in multiple urban driving and search and rescue experiments. We show that the learned SegMap descriptor has superior segment retrieval capabilities, compared to state-of-the-art handcrafted descriptors. In consequence, we achieve a higher localization accuracy and a 6% increase in recall over state-of-the-art. These segment-based localizations allow us to reduce the open-loop odometry drift by up to 50%. SegMap is available open source, along with easy-to-run demonstrations.

RO | Jul 22, 2019 | Code
Revisiting Boustrophedon Coverage Path Planning as a Generalized Traveling Salesman Problem

Rik Bähnemann, Nicholas Lawrance, Jen Jen Chung et al.

In this paper, we present a path planner for low-altitude terrain coverage in known environments with unmanned rotary-wing micro aerial vehicles (MAVs). Airborne systems can assist humanitarian demining by surveying suspected hazardous areas (SHAs) with cameras, ground-penetrating synthetic aperture radar (GPSAR), and metal detectors. Most available coverage planner implementations for MAVs do not consider obstacles and thus cannot be deployed in obstructed environments. We describe an open source framework to perform coverage planning in polygon flight corridors with obstacles. Our planner extends boustrophedon coverage planning by optimizing over different sweep combinations to find the optimal sweep path, and considers obstacles during transition flights between cells. We evaluate the path planner on 320 synthetic maps and show that it is able to solve realistic planning instances fast enough to run in the field. The planner achieves 14% lower path costs than a conventional coverage planner. We validate the planner on a real platform where we show low-altitude coverage over a sloped terrain with trees.
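
For readers unfamiliar with the term, a boustrophedon sweep is a back-and-forth "lawnmower" pattern. The sketch below generates such a sweep for a single obstacle-free, axis-aligned rectangle. It illustrates only the basic sweep primitive; the optimization over sweep combinations and the obstacle-aware transitions described above are not shown, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def boustrophedon_sweep(width: float, height: float, spacing: float) -> List[Point]:
    """Waypoints of a back-and-forth sweep over [0, width] x [0, height].

    Sweep lines run parallel to the x axis and are `spacing` apart in y.
    The sweep direction alternates on every line.
    """
    if spacing <= 0 or width < 0 or height < 0:
        raise ValueError("spacing must be positive, width/height non-negative")
    num_lines = int(math.floor(height / spacing + 1e-9)) + 1
    waypoints: List[Point] = []
    for i in range(num_lines):
        y = i * spacing
        if i % 2 == 0:
            waypoints += [(0.0, y), (width, y)]  # left to right
        else:
            waypoints += [(width, y), (0.0, y)]  # right to left
    return waypoints


def path_length(waypoints: List[Point]) -> float:
    """Total Euclidean length of the polyline through the waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


if __name__ == "__main__":
    path = boustrophedon_sweep(4.0, 2.0, 1.0)
    print(path, path_length(path))
```

The path cost depends on the sweep direction and start corner, which is why a planner may want to compare several sweep variants per cell.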

RO | Dec 10, 2018 | Code
An Open-Source System for Vision-Based Micro-Aerial Vehicle Mapping, Planning, and Flight in Cluttered Environments

Helen Oleynikova, Christian Lanegger, Zachary Taylor et al.

We present an open-source system for Micro-Aerial Vehicle autonomous navigation from vision-based sensing. Our system focuses on dense mapping, safe local planning, and global trajectory generation, especially when using narrow field of view sensors in very cluttered environments. In addition, details about other necessary parts of the system and special considerations for applications in real-world scenarios are presented. We focus our experiments on evaluating global planning, path smoothing, and local planning methods on real maps made on MAVs in realistic search and rescue and industrial inspection scenarios. We also perform thousands of simulations in cluttered synthetic environments, and finally validate the complete system in real-world experiments.

RO | Apr 25, 2018 | Code
SegMap: 3D Segment Mapping using Data-Driven Descriptors

Renaud Dubé, Andrei Cramariuc, Daniel Dugas et al.

When performing localization and mapping, working at the level of structure can be advantageous in terms of robustness to environmental changes and differences in illumination. This paper presents SegMap: a map representation solution to the localization and mapping problem based on the extraction of segments in 3D point clouds. In addition to facilitating the computationally intensive task of processing 3D point clouds, working at the level of segments addresses the data compression requirements of real-time single- and multi-robot systems. While current methods extract descriptors for the single task of localization, SegMap leverages a data-driven descriptor in order to extract meaningful features that can also be used for reconstructing a dense 3D map of the environment and for extracting semantic information. This is particularly interesting for navigation tasks and for providing visual feedback to end-users such as robot operators, for example in search and rescue scenarios. These capabilities are demonstrated in multiple urban driving and search and rescue experiments. Our method leads to an increase of area under the ROC curve of 28.3% over current state of the art using eigenvalue based features. We also obtain very similar reconstruction capabilities to a model specifically trained for this task. The SegMap implementation will be made available open-source along with easy to run demonstrations at www.github.com/ethz-asl/segmap. A video demonstration is available at https://youtu.be/CMk4w4eRobg.

RO | Aug 22, 2017 | Code
Build Your Own Visual-Inertial Drone: A Cost-Effective and Open-Source Autonomous Drone

Inkyu Sa, Mina Kamel, Michael Burri et al.

This paper describes an approach to building a cost-effective and research grade visual-inertial odometry aided vertical take-off and landing (VTOL) platform. We utilize an off-the-shelf visual-inertial sensor, an onboard computer, and a quadrotor platform that are factory-calibrated and mass-produced, thereby sharing similar hardware and sensor specifications (e.g., mass, dimensions, intrinsics and extrinsics of camera-IMU systems, and signal-to-noise ratio). We then perform a system calibration and identification enabling the use of our visual-inertial odometry, multi-sensor fusion, and model predictive control frameworks with the off-the-shelf products. This implies that we can partially avoid tedious parameter tuning procedures for building a full system. The complete system is extensively evaluated both indoors using a motion capture system and outdoors using a laser tracker while performing hover and step responses, and trajectory following tasks in the presence of external wind disturbances. We achieve root-mean-square (RMS) pose errors between the reference and actual trajectories of 0.036m while performing hover. We also conduct relatively long distance flight (~180m) experiments on a farm site and achieve a drift error of 0.82% of the total flight distance. This paper conveys the insights we acquired about the platform and sensor module and returns them to the community as open-source code with tutorial documentation.

RO | Jan 30, 2017 | Code
Dynamic System Identification, and Control for a cost effective open-source VTOL MAV

Inkyu Sa, Mina Kamel, Raghav Khanna et al.

This paper describes dynamic system identification and full control of a cost-effective vertical take-off and landing (VTOL) multi-rotor micro-aerial vehicle (MAV) --- DJI Matrice 100. The dynamics of the vehicle and autopilot controllers are identified using only a built-in IMU and utilized to design a subsequent model predictive controller (MPC). Experimental results for the control performance are evaluated using a motion capture system while performing hover, step responses, and trajectory following tasks in the presence of external wind disturbances. We achieve root-mean-square (RMS) errors between the reference and actual trajectory of x=0.021m, y=0.016m, z=0.029m, roll=0.392deg, pitch=0.618deg, and yaw=1.087deg while performing hover. This paper also conveys the insights we have gained about the platform and returns them to the community through open-source code and documentation.
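
The per-axis RMS figures quoted above follow the usual definition: the square root of the mean squared difference between reference and measured samples. A minimal sketch of that computation is given below; the function name and the sample values are made up for illustration and do not come from the paper's data.

```python
import math
from typing import Sequence


def rms_error(reference: Sequence[float], actual: Sequence[float]) -> float:
    """Root-mean-square error between two equally long sample sequences."""
    if len(reference) != len(actual) or len(reference) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(r - a) ** 2 for r, a in zip(reference, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical hover test: the reference x position is constant at 0 m.
    x_ref = [0.0, 0.0, 0.0, 0.0]
    x_meas = [0.02, -0.02, 0.02, -0.02]
    print(rms_error(x_ref, x_meas))
```

Applying the same function separately to x, y, z, roll, pitch and yaw gives one RMS value per axis, as in the list of numbers above.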

RO | Nov 11, 2016 | Code
Voxblox: Incremental 3D Euclidean Signed Distance Fields for On-Board MAV Planning

Helen Oleynikova, Zachary Taylor, Marius Fehr et al.

Micro Aerial Vehicles (MAVs) that operate in unstructured, unexplored environments require fast and flexible local planning, which can replan when new parts of the map are explored. Trajectory optimization methods fulfill these needs, but require obstacle distance information, which can be given by Euclidean Signed Distance Fields (ESDFs). We propose a method to incrementally build ESDFs from Truncated Signed Distance Fields (TSDFs), a common implicit surface representation used in computer graphics and vision. TSDFs are fast to build and smooth out sensor noise over many observations, and are designed to produce surface meshes. Meshes allow human operators to get a better assessment of the robot's environment, and set high-level mission goals. We show that we can build TSDFs faster than Octomaps, and that it is more accurate to build ESDFs out of TSDFs than occupancy maps. Our complete system, called voxblox, will be available as open source and runs in real-time on a single CPU core. We validate our approach on-board an MAV, by using our system with a trajectory optimization local planner, entirely on-board and in real-time.
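
The statement that TSDFs "smooth out sensor noise over many observations" can be made concrete with the standard weighted running-average update for a single voxel. The sketch below shows that generic update. The constant observation weight and the weight cap are assumptions chosen for simplicity and are not necessarily the weighting scheme used in voxblox.

```python
from typing import Tuple


def update_tsdf_voxel(
    distance: float,
    weight: float,
    observed_distance: float,
    truncation: float,
    obs_weight: float = 1.0,    # assumed constant per-observation weight
    max_weight: float = 100.0,  # assumed cap so old data can still be revised
) -> Tuple[float, float]:
    """Fuse one new signed-distance observation into a voxel.

    The observation is clamped to [-truncation, truncation] and then merged
    with the stored value as a weighted average.
    """
    d_obs = max(-truncation, min(truncation, observed_distance))
    new_weight = weight + obs_weight
    new_distance = (distance * weight + d_obs * obs_weight) / new_weight
    return new_distance, min(new_weight, max_weight)


if __name__ == "__main__":
    d, w = 0.0, 0.0  # an unobserved voxel starts with zero weight
    for measurement in (0.5, 0.0):
        d, w = update_tsdf_voxel(d, w, measurement, truncation=0.2)
        print(d, w)
```

Each new measurement moves the stored distance less as the accumulated weight grows, which is what averages out noise over repeated observations.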

RO | Feb 3, 2022
Spatial Computing and Intuitive Interaction: Bringing Mixed Reality and Robotics Together

Jeffrey Delmerico, Roi Poranne, Federica Bogo et al.

Spatial computing -- the ability of devices to be aware of their surroundings and to represent this digitally -- offers novel capabilities in human-robot interaction. In particular, the combination of spatial computing and egocentric sensing on mixed reality devices enables them to capture and understand human actions and translate these to actions with spatial meaning, which offers exciting new possibilities for collaboration between humans and robots. This paper presents several human-robot systems that utilize these capabilities to enable novel robot use cases: mission planning for inspection, gesture-based control, and immersive teleoperation. These works demonstrate the power of mixed reality as a tool for human-robot interaction, and the potential of spatial computing and mixed reality to drive the future of human-robot interaction.

RO | Jan 18, 2022
CERBERUS: Autonomous Legged and Aerial Robotic Exploration in the Tunnel and Urban Circuits of the DARPA Subterranean Challenge

Marco Tranzatto, Frank Mascarich, Lukas Bernreiter et al.

Autonomous exploration of subterranean environments constitutes a major frontier for robotic systems as underground settings present key challenges that can render robot autonomy hard to achieve. This has motivated the DARPA Subterranean Challenge, where teams of robots search for objects of interest in various underground environments. In response, the CERBERUS system-of-systems is presented as a unified strategy towards subterranean exploration using legged and flying robots. As primary robots, ANYmal quadruped systems are deployed considering their endurance and potential to traverse challenging terrain. For aerial robots, both conventional and collision-tolerant multirotors are utilized to explore spaces too narrow or otherwise unreachable by ground systems. Anticipating degraded sensing conditions, a complementary multi-modal sensor fusion approach utilizing camera, LiDAR, and inertial data for resilient robot pose estimation is proposed. Individual robot pose estimates are refined by a centralized multi-robot map optimization approach to improve the reported location accuracy of detected objects of interest in the DARPA-defined coordinate frame. Furthermore, a unified exploration path planning policy is presented to facilitate the autonomous operation of both legged and aerial robots in complex underground networks. Finally, to enable communication between the robots and the base station, CERBERUS utilizes a ground rover with a high-gain antenna and an optical fiber connection to the base station, alongside breadcrumbing of wireless nodes by our legged robots. We report results from the CERBERUS system-of-systems deployment at the DARPA Subterranean Challenge Tunnel and Urban Circuits, along with the current limitations and the lessons learned for the benefit of the community.

RO | Sep 21, 2021
Panoptic Multi-TSDFs: a Flexible Representation for Online Multi-resolution Volumetric Mapping and Long-term Dynamic Scene Consistency

Lukas Schmid, Jeffrey Delmerico, Johannes Schönberger et al.

For robotic interaction in environments shared with other agents, access to volumetric and semantic maps of the scene is crucial. However, such environments are inevitably subject to long-term changes, which the map needs to account for. We thus propose panoptic multi-TSDFs as a novel representation for multi-resolution volumetric mapping in changing environments. By leveraging high-level information for 3D reconstruction, our proposed system allocates high resolution only where needed. Through reasoning on the object level, semantic consistency over time is achieved. This enables our method to maintain up-to-date reconstructions with high accuracy while improving coverage by incorporating previous data. We show in thorough experimental evaluation that our map can be efficiently constructed, maintained, and queried during online operation, and that the presented approach can operate robustly on real depth sensors using non-optimized panoptic segmentation as input.

CV | Sep 20, 2021
Superquadric Object Representation for Optimization-based Semantic SLAM

Florian Tschopp, Juan Nieto, Roland Siegwart et al.

Introducing semantically meaningful objects to visual Simultaneous Localization And Mapping (SLAM) has the potential to improve both the accuracy and reliability of pose estimates, especially in challenging scenarios with significant view-point and appearance changes. However, how semantic objects should be represented for an efficient inclusion in optimization-based SLAM frameworks is still an open question. Superquadrics (SQs) are an efficient and compact object representation, able to represent most common object types to a high degree, and typically retrieved from 3D point-cloud data. However, accurate 3D point-cloud data might not be available in all applications. Recent advancements in machine learning enabled robust object recognition and semantic mask measurements from camera images under many different appearance conditions. We propose a pipeline to leverage such semantic mask measurements to fit SQ parameters to multi-view camera observations using a multi-stage initialization and optimization procedure. We demonstrate the system's ability to retrieve randomly generated SQ parameters from multi-view mask observations in preliminary simulation experiments and evaluate different initialization stages and cost functions.

RO | Jul 30, 2021
SemSegMap - 3D Segment-Based Semantic Localization

Andrei Cramariuc, Florian Tschopp, Nikhilesh Alatur et al.

Localization is an essential task for mobile autonomous robotic systems that want to use pre-existing maps or create new ones in the context of SLAM. Today, many robotic platforms are equipped with high-accuracy 3D LiDAR sensors, which allow a geometric mapping, and cameras able to provide semantic cues of the environment. Segment-based mapping and localization have been applied with great success to 3D point-cloud data, while semantic understanding has been shown to improve localization performance in vision based systems. In this paper we combine both modalities in SemSegMap, extending SegMap into a segment based mapping framework able to also leverage color and semantic data from the environment to improve localization accuracy and robustness. In particular, we present new segmentation and descriptor extraction processes. The segmentation process benefits from additional distance information from color and semantic class consistency resulting in more repeatable segments and more overlap after re-visiting a place. For the descriptor, a tight fusion approach in a deep-learned descriptor extraction network is performed leading to a higher descriptiveness for landmark matching. We demonstrate the advantages of this fusion on multiple simulated and real-world datasets and compare its performance to various baselines. We show that we are able to find 50.9% more high-accuracy prior-less global localizations compared to SegMap on challenging datasets using very compact maps while also providing accurate full 6 DoF pose estimates in real-time.

CV | May 16, 2021
TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction

Margarita Grinvald, Federico Tombari, Roland Siegwart et al.

The ability to simultaneously track and reconstruct multiple objects moving in the scene is of the utmost importance for robotic tasks such as autonomous navigation and interaction. Virtually all of the previous attempts to map multiple dynamic objects have evolved to store individual objects in separate reconstruction volumes and track the relative pose between them. While simple and intuitive, such formulation does not scale well with respect to the number of objects in the scene and introduces the need for an explicit occlusion handling strategy. In contrast, we propose a map representation that allows maintaining a single volume for the entire scene and all the objects therein. To this end, we introduce a novel multi-object TSDF formulation that can encode multiple object surfaces at any given location in the map. In a multiple dynamic object tracking and reconstruction scenario, our representation allows maintaining accurate reconstruction of surfaces even while they become temporarily occluded by other objects moving in their proximity. We evaluate the proposed TSDF++ formulation on a public synthetic dataset and demonstrate its ability to preserve reconstructions of occluded surfaces when compared to the standard TSDF map representation.

RO | Apr 29, 2021
Crowd against the machine: A simulation-based benchmark tool to evaluate and compare robot capabilities to navigate a human crowd

Fabien Grzeskowiak, David Gonon, Daniel Dugas et al.

The evaluation of robot capabilities to navigate human crowds is essential to conceive new robots intended to operate in public spaces. This paper initiates the development of a benchmark tool to evaluate such capabilities; our long term vision is to provide the community with a simulation tool that generates virtual crowded environments to test robots, to establish standard scenarios and metrics to evaluate navigation techniques in terms of safety and efficiency, and thus to establish new methods for benchmarking robots' crowd navigation capabilities. This paper presents the architecture of the simulation tools, introduces first scenarios and evaluation metrics, as well as early results to demonstrate that our solution is relevant to be used as a benchmark tool.

RO | Apr 17, 2021
Spherical Multi-Modal Place Recognition for Heterogeneous Sensor Systems

Lukas Bernreiter, Lionel Ott, Juan Nieto et al.

In this paper, we propose a robust end-to-end multi-modal pipeline for place recognition where the sensor systems can differ from the map building to the query. Our approach operates directly on images and LiDAR scans without requiring any local feature extraction modules. By projecting the sensor data onto the unit sphere, we learn a multi-modal descriptor of partially overlapping scenes using a spherical convolutional neural network. The employed spherical projection model enables the support of arbitrary LiDAR and camera systems readily without losing information. Loop closure candidates are found using a nearest-neighbor lookup in the embedding space. We tackle the problem of correctly identifying the closest place by correlating the candidates' power spectra, obtaining a confidence value per prospect. Our estimate for the correct place corresponds then to the candidate with the highest confidence. We evaluate our proposal w.r.t. state-of-the-art approaches in place recognition using real-world data acquired using different sensors. Our approach can achieve a recall that is up to 10% and 5% higher than for a LiDAR- and vision-based system, respectively, when the sensor setup differs between model training and deployment. Additionally, our place selection can correctly identify up to 95% matches from the candidate set.
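
The candidate-retrieval step mentioned above, a nearest-neighbor lookup in the embedding space, can be sketched with a brute-force search. The descriptors in the paper come from a spherical convolutional neural network; the short tuples used here are placeholders, and the function name is hypothetical. The subsequent ranking of candidates by power-spectrum correlation is not shown.

```python
import math
from typing import List, Sequence


def nearest_candidates(
    query: Sequence[float],
    database: Sequence[Sequence[float]],
    k: int = 3,
) -> List[int]:
    """Indices of the k database descriptors closest to `query`.

    Uses Euclidean distance and an exhaustive scan, which is adequate for
    small maps. Larger maps would typically use a k-d tree or similar index.
    """
    distances = [(math.dist(query, d), i) for i, d in enumerate(database)]
    distances.sort()
    return [i for _, i in distances[:k]]


if __name__ == "__main__":
    map_descriptors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
    print(nearest_candidates((1.0, 1.0), map_descriptors, k=2))
```

The returned indices are loop-closure candidates only; a separate confidence measure is still needed to decide which candidate, if any, is the correct place.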

RO | Feb 20, 2021
Mesh Manifold based Riemannian Motion Planning for Omnidirectional Micro Aerial Vehicles

Michael Pantic, Lionel Ott, Cesar Cadena et al.

This paper presents a novel on-line path planning method that enables aerial robots to interact with surfaces. We present a solution to the problem of finding trajectories that drive a robot towards a surface and move along it. Triangular meshes are used as a surface map representation that is free of fixed discretization and allows for very large workspaces. We propose to leverage planar parametrization methods to obtain a lower-dimensional topologically equivalent representation of the original surface. Furthermore, we interpret the original surface and its lower-dimensional representation as manifold approximations that allow the use of Riemannian Motion Policies (RMPs), resulting in an efficient, versatile, and elegant motion generation framework. We compare against several Rapidly-exploring Random Tree (RRT) planners, a customized CHOMP variant, and the discrete geodesic algorithm. Using extensive simulations on real-world data we show that the proposed planner can reliably plan high-quality near-optimal trajectories at minimal computational cost. The accompanying multimedia attachment demonstrates feasibility on a real OMAV. The obtained paths show less than 10% deviation from the theoretical optimum while facilitating reactive re-planning at kHz refresh rates, enabling flying robots to perform motion planning for interaction with complex surfaces.

RO | Feb 16, 2021
Hough2Map -- Iterative Event-based Hough Transform for High-Speed Railway Mapping

Florian Tschopp, Cornelius von Einem, Andrei Cramariuc et al.

To cope with the growing demand for transportation on the railway system, accurate, robust, and high-frequency positioning is required to enable a safe and efficient utilization of the existing railway infrastructure. As a basis for a localization system we propose a complete on-board mapping pipeline able to map robust meaningful landmarks, such as poles from power lines, in the vicinity of the vehicle. Such poles are good candidates for reliable and long term landmarks even through difficult weather conditions or seasonal changes. To address the challenges of motion blur and illumination changes in railway scenarios we employ a Dynamic Vision Sensor, a novel event-based camera. Using a sideways oriented on-board camera, poles appear as vertical lines. To map such lines in a real-time event stream, we introduce Hough2Map, a novel consecutive iterative event-based Hough transform framework capable of detecting, tracking, and triangulating close-by structures. We demonstrate the mapping reliability and accuracy of Hough2Map on real-world data in typical usage scenarios and evaluate using surveyed infrastructure ground truth maps. Hough2Map achieves a detection reliability of up to 92% and a mapping root mean square error accuracy of 1.1518m.

RO | Feb 3, 2021
PHASER: a Robust and Correspondence-free Global Pointcloud Registration

Lukas Bernreiter, Lionel Ott, Juan Nieto et al.

We propose PHASER, a correspondence-free global registration of sensor-centric pointclouds that is robust to noise, sparsity, and partial overlaps. Our method can seamlessly handle multimodal information and does not rely on keypoint or descriptor preprocessing modules. By exploiting properties of Fourier analysis, PHASER operates directly on the sensor's signal, fusing the spectra of multiple channels and computing the 6-DoF transformation based on correlation. Our registration pipeline starts by finding the most likely rotation followed by computing the most likely translation. Both estimates are distributed according to a probability distribution that takes the underlying manifold into account, i.e., a Bingham and Gaussian distribution, respectively. This further allows our approach to consider the periodic nature of rotations and naturally represent its uncertainty. We extensively compare PHASER against several well-known registration algorithms on both simulated datasets and real-world data acquired using different sensor configurations. Our results show that PHASER can globally align pointclouds in less than 100ms with an average accuracy of 2cm and 0.5deg, is resilient against noise, and can handle partial overlap.
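
The idea of recovering a transformation from correlation in the frequency domain can be illustrated with a toy 1D analogue: estimating a circular shift between two signals by cross-correlating them via the discrete Fourier transform. This is only the textbook Fourier shift property on a 1D signal. It is not PHASER's spherical or 3D formulation, and all names below are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[complex]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum: Sequence[complex]) -> List[complex]:
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def estimate_circular_shift(a: Sequence[float], b: Sequence[float]) -> int:
    """Return the shift s that best explains b[t] = a[(t - s) % n].

    The circular cross-correlation is computed as the inverse DFT of
    B[k] * conj(A[k]); its peak location is the estimated shift.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("signals must be non-empty and of equal length")
    spec_a, spec_b = dft(a), dft(b)
    cross = [bk * ak.conjugate() for ak, bk in zip(spec_a, spec_b)]
    correlation = idft(cross)
    return max(range(len(correlation)), key=lambda t: correlation[t].real)


if __name__ == "__main__":
    signal = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    shifted = [signal[(t - 3) % len(signal)] for t in range(len(signal))]
    print(estimate_circular_shift(signal, shifted))
```

No point-to-point correspondences are needed here: the shift falls out of a single correlation peak, which is the property the correspondence-free approach builds on.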

RO | Jan 20, 2021
Active Model Learning using Informative Trajectories for Improved Closed-Loop Control on Real Robots

Weixuan Zhang, Marco Tognon, Lionel Ott et al.

Model-based controllers on real robots require accurate knowledge of the system dynamics to perform optimally. For complex dynamics, first-principles modeling is not sufficiently precise, and data-driven approaches can be leveraged to learn a statistical model from real experiments. However, the efficient and effective data collection for such a data-driven system on real robots is still an open challenge. This paper introduces an optimization problem formulation to find an informative trajectory that allows for efficient data collection and model learning. We present a sampling-based method that computes an approximation of the trajectory that minimizes the prediction uncertainty of the dynamics model. This trajectory is then executed, collecting the data to update the learned model. In experiments we demonstrate the capabilities of our proposed framework when applied to a complex omnidirectional flying vehicle with tiltable rotors. Using our informative trajectories results in models which outperform models obtained from non-informative trajectories by 13.3% with the same amount of training data. Furthermore, we show that the model learned from informative trajectories generalizes better than the one learned from non-informative trajectories, achieving better tracking performance on different tasks.

RODec 8, 2020
Multiple Hypothesis Semantic Mapping for Robust Data Association

Lukas Bernreiter, Abel Gawel, Hannes Sommer et al.

In this paper, we present a semantic mapping approach with multiple hypothesis tracking for data association. As semantic information has the potential to overcome ambiguity in measurements and place recognition, it forms an eminent modality for autonomous systems. This is particularly evident in urban scenarios with several similar looking surroundings. Nevertheless, it requires the handling of a non-Gaussian and discrete random variable coming from object detectors. Previous methods use semantic information for global localization and data association to reduce the instance ambiguity between the landmarks. However, many of these approaches do not deal with the creation of complete globally consistent representations of the environment and typically do not scale well. We utilize multiple hypothesis trees to derive a probabilistic data association for semantic measurements by means of position, instance and class to create a semantic representation. We propose an optimized mapping method and make use of a pose graph to derive a novel semantic SLAM solution. Furthermore, we show that semantic covisibility graphs allow for a precise place recognition in urban environments. We verify our approach using a real-world outdoor dataset and demonstrate an average drift reduction of 33 % w.r.t. the raw odometry source. Moreover, our approach produces 55 % fewer hypotheses on average than a regular multiple hypotheses approach.

RODec 8, 2020
NavRep: Unsupervised Representations for Reinforcement Learning of Robot Navigation in Dynamic Human Environments

Daniel Dugas, Juan Nieto, Roland Siegwart et al.

Robot navigation is a task where reinforcement learning approaches are still unable to compete with traditional path planning. State-of-the-art methods differ in small ways, and do not all provide reproducible, openly available implementations. This makes comparing methods a challenge. Recent research has shown that unsupervised learning methods can scale impressively, and be leveraged to solve difficult problems. In this work, we design ways in which unsupervised learning can be used to assist reinforcement learning for robot navigation. We train two end-to-end, and 18 unsupervised-learning-based architectures, and compare them, along with existing approaches, in unseen test cases. We demonstrate our approach working on a real life robot. Our results show that unsupervised learning methods are competitive with end-to-end methods. We also highlight the importance of various components such as input representation, predictive unsupervised learning, and latent features. We make all our models publicly available, as well as training and testing environments, and tools. This release also includes OpenAI-gym-compatible environments designed to emulate the training conditions described by other papers, with as much fidelity as possible. Our hope is that this helps in bringing together the field of RL for robot navigation, and allows meaningful comparisons across state-of-the-art methods.

CVNov 3, 2020
Out-of-Distribution Detection for Automotive Perception

Julia Nitsch, Masha Itkina, Ransalu Senanayake et al.

Neural networks (NNs) are widely used for object classification in autonomous driving. However, NNs can fail on input data not well represented by the training dataset, known as out-of-distribution (OOD) data. A mechanism to detect OOD samples is important for safety-critical applications, such as automotive perception, to trigger a safe fallback mode. NNs often rely on softmax normalization for confidence estimation, which can lead to high confidences being assigned to OOD samples, thus hindering the detection of failures. This paper presents a method for determining whether inputs are OOD, which does not require OOD data during training and does not increase the computational cost of inference. The latter property is especially important in automotive applications with limited computational resources and real-time constraints. Our proposed approach outperforms state-of-the-art methods on real-world automotive datasets.
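
The softmax issue mentioned in the abstract above can be made concrete with a small numeric sketch. It does not reproduce the paper's detection method, which the abstract does not describe; it only shows why the maximum softmax probability is a weak confidence signal: softmax only sees differences between logits, so any input that happens to produce one dominant logit gets a confidence near 1, whether or not it resembles the training data. The logit values and the helper name `max_softmax_confidence` are assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_confidence(logits):
    """Confidence score often used as a baseline: the largest class probability."""
    return max(softmax(logits))


# Hypothetical logits for a familiar (in-distribution) input.
in_dist_logits = [4.0, 1.0, 0.5]
# Hypothetical logits for an unfamiliar input that happens to excite one class strongly.
ood_logits = [9.0, 1.0, 0.5]

in_dist_conf = max_softmax_confidence(in_dist_logits)
ood_conf = max_softmax_confidence(ood_logits)
```

With these made-up numbers the unfamiliar input receives a higher confidence than the familiar one, which is the failure mode that motivates a dedicated OOD detector.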

ROOct 20, 2020
Automatic Extension of a Symbolic Mobile Manipulation Skill Set

Julian Förster, Lionel Ott, Juan Nieto et al.

Symbolic planning can provide an intuitive interface for non-expert users to operate autonomous robots by abstracting away much of the low-level programming. However, symbolic planners assume that the initially provided abstract domain and problem descriptions are closed and complete. This means that they are fundamentally unable to adapt to changes in the environment or task that are not captured by the initial description. We propose a method that allows an agent to automatically extend its skill set, and thus the abstract description, upon encountering such a situation. We introduce strategies for generalizing from previous experience, completing sequences of key actions and discovering preconditions to ensure the efficiency of our skill sequence exploration scheme. The resulting system is evaluated in simulation on object rearrangement tasks. Compared to a Monte Carlo Tree Search baseline, our strategies for efficient search have on average a 29% higher success rate at a 68% faster runtime.

ROOct 19, 2020
Freetures: Localization in Signed Distance Function Maps

Alexander Millane, Helen Oleynikova, Christian Lanegger et al.

Localization of a robotic system within a previously mapped environment is important for reducing estimation drift and for reusing previously built maps. Existing techniques for geometry-based localization have focused on the description of local surface geometry, usually using pointclouds as the underlying representation. We propose a system for geometry-based localization that extracts features directly from an implicit surface representation: the Signed Distance Function (SDF). The SDF varies continuously through space, which allows the proposed system to extract and utilize features describing both surfaces and free-space. Through evaluations on public datasets, we demonstrate the flexibility of this approach, and show an increase in localization performance over state-of-the-art handcrafted surfaces-only descriptors. We achieve an average improvement of ~12% on an RGB-D dataset and ~18% on a LiDAR-based dataset. Finally, we demonstrate our system for localizing a LiDAR-equipped MAV within a previously built map of a search and rescue training ground.

ROAug 13, 2020
IDOL: A Framework for IMU-DVS Odometry using Lines

Cedric Le Gentil, Florian Tschopp, Ignacio Alzugaray et al.

In this paper, we introduce IDOL, an optimization-based framework for IMU-DVS Odometry using Lines. Event cameras, also called Dynamic Vision Sensors (DVSs), generate highly asynchronous streams of events triggered upon illumination changes for each individual pixel. This novel paradigm presents advantages in low illumination conditions and high-speed motions. Nonetheless, this unconventional sensing modality brings new challenges to perform scene reconstruction or motion estimation. The proposed method offers to leverage a continuous-time representation of the inertial readings to associate each event with timely accurate inertial data. The method's front-end extracts event clusters that belong to line segments in the environment whereas the back-end estimates the system's trajectory alongside the lines' 3D position by minimizing point-to-line distances between individual events and the lines' projection in the image space. A novel attraction/repulsion mechanism is presented to accurately estimate the lines' extremities, avoiding their explicit detection in the event data. The proposed method is benchmarked against a state-of-the-art frame-based visual-inertial odometry framework using public datasets. The results show that IDOL performs at the same order of magnitude on most datasets and even shows better orientation estimates. These findings can have a great impact on new algorithms for DVS.

ROJun 23, 2020
Learning dynamics for improving control of overactuated flying systems

Weixuan Zhang, Maximilian Brunner, Lionel Ott et al.

Overactuated omnidirectional flying vehicles are capable of generating force and torque in any direction, which is important for applications such as contact-based industrial inspection. This comes at the price of an increase in model complexity. These vehicles usually have non-negligible, repetitive dynamics that are hard to model, such as the aerodynamic interference between the propellers. This makes it difficult for high-performance trajectory tracking using a model-based controller. This paper presents an approach that combines a data-driven and a first-principle model for the system actuation and uses it to improve the controller. In a first step, the first-principle model errors are learned offline using a Gaussian Process (GP) regressor. At runtime, the first-principle model and the GP regressor are used jointly to obtain control commands. This is formulated as an optimization problem, which avoids ambiguous solutions present in a standard inverse model in overactuated systems, by only using forward models. The approach is validated using a tilt-arm overactuated omnidirectional flying vehicle performing attitude trajectory tracking. The results show that with our proposed method, the attitude trajectory error is reduced by 32% on average as compared to a nominal PID controller.
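
The first step described above, learning the errors of a first-principles model with a GP and then using both models together, can be sketched in one dimension. This is only an illustration under strong assumptions: the nominal model, the "true" system, the RBF kernel, its length scale and the noise level are all hypothetical, and the optimization over forward models that the abstract uses to obtain control commands is not shown.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def nominal_model(u):
    return 2.0 * u  # hypothetical first-principles model


def true_system(u):
    return 2.0 * u + 0.5 * math.sin(u)  # hypothetical real behaviour


# Offline: collect residuals between the real system and the nominal model.
train_u = [0.5 * i for i in range(13)]
residuals = [true_system(u) - nominal_model(u) for u in train_u]

# GP regression weights: alpha = (K + noise * I)^-1 * residuals.
noise = 1e-4
K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_u)]
     for i, a in enumerate(train_u)]
alpha = solve(K, residuals)


def corrected_model(u):
    """At runtime: nominal prediction plus the GP posterior mean of the residual."""
    gp_mean = sum(rbf(u, t) * a for t, a in zip(train_u, alpha))
    return nominal_model(u) + gp_mean


test_u = 2.25
nominal_error = abs(nominal_model(test_u) - true_system(test_u))
corrected_error = abs(corrected_model(test_u) - true_system(test_u))
```

Comparing `nominal_error` with `corrected_error` at an input that was not in the training set shows how much of the unmodeled effect the learned residual recovers in this toy setting.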

ROApr 2, 2020
Go Fetch: Mobile Manipulation in Unstructured Environments

Kenneth Blomqvist, Michel Breyer, Andrei Cramariuc et al.

With humankind facing new and increasingly large-scale challenges in the medical and domestic spheres, automation of the service sector carries a tremendous potential for improved efficiency, quality, and safety of operations. Mobile robotics can offer solutions with a high degree of mobility and dexterity; however, these complex systems require a multitude of heterogeneous components to be carefully integrated into one consistent framework. This work presents a mobile manipulation system that combines perception, localization, navigation, motion planning and grasping skills into one common workflow for fetch and carry applications in unstructured indoor environments. The tight integration across the various modules is experimentally demonstrated on the task of finding a commonly available object in an office environment, grasping it, and delivering it to a desired drop-off location. The accompanying video is available at https://youtu.be/e89_Xg1sLnY.

ROMar 20, 2020
Active Interaction Force Control for Contact-Based Inspection with a Fully Actuated Aerial Vehicle

Karen Bodie, Maximilian Brunner, Michael Pantic et al.

This paper presents and validates active interaction force control and planning for fully actuated and omnidirectional aerial manipulation platforms, with the goal of aerial contact inspection in unstructured environments. We present a variable axis-selective impedance control which integrates direct force control for intentional interaction, using feedback from an on-board force sensor. The control approach aims to reject disturbances in free flight, while handling unintentional interaction, and actively controlling desired interaction forces. A fully actuated and omnidirectional tilt-rotor aerial system is used to show capabilities of the control and planning methods. Experiments demonstrate disturbance rejection, push-and-slide interaction, and force controlled interaction in different flight orientations. The system is validated as a tool for non-destructive testing of concrete infrastructure, and statistical results of

ROMar 20, 2020
Design and optimal control of a tiltrotor micro aerial vehicle for efficient omnidirectional flight

Mike Allenspach, Karen Bodie, Maximilian Brunner et al.

Omnidirectional micro aerial vehicles are a growing field of research, with demonstrated advantages for aerial interaction and uninhibited observation. While systems with complete pose omnidirectionality and high hover efficiency have been developed independently, a robust system that combines the two has not been demonstrated to date. This paper presents the design and optimal control of a novel omnidirectional vehicle that can exert a wrench in any orientation while maintaining efficient flight configurations. The system design is motivated by the result of a morphology design optimization. A six degrees of freedom optimal controller is derived, with an actuator allocation approach that implements task prioritization, and is robust to singularities. Flight experiments demonstrate and verify the system's capabilities.

ROMar 2, 2020
MOZARD: Multi-Modal Localization for Autonomous Vehicles in Urban Outdoor Environments

Lukas Schaupp, Patrick Pfreundschuh, Mathias Buerki et al.

Visually poor scenarios are one of the main sources of failure in visual localization systems in outdoor environments. To address this challenge, we present MOZARD, a multi-modal localization system for urban outdoor environments using vision and LiDAR. By extending our preexisting key-point based visual multi-session local localization approach with the use of semantic data, an improved localization recall can be achieved across vastly different appearance conditions. In particular we focus on the use of curbstone information because of their broad distribution and reliability within urban environments. We present thorough experimental evaluations on several driving kilometers in challenging urban outdoor environments, analyze the recall and accuracy of our localization system and demonstrate in a case study possible failure cases of each subsystem. We demonstrate that MOZARD is able to bridge scenarios where our previous work VIZARD fails, hence yielding an increased recall performance, while a similar localization accuracy of 0.2m is achieved

ROFeb 25, 2020
Whole-Body Control of a Mobile Manipulator using End-to-End Reinforcement Learning

Julien Kindle, Fadri Furrer, Tonci Novkovic et al.

Mobile manipulation is usually achieved by sequentially executing base and manipulator movements. This simplification, however, leads to a loss in efficiency and in some cases a reduction of workspace size. Even though different methods have been proposed to solve Whole-Body Control (WBC) online, they are either limited by a kinematic model or do not allow for reactive, online obstacle avoidance. In order to overcome these drawbacks, in this work, we propose an end-to-end Reinforcement Learning (RL) approach to WBC. We compared our learned controller against a state-of-the-art sampling-based method in simulation and achieved faster overall mission times. In addition, we validated the learned policy on our mobile manipulator RoyalPanda in challenging narrow corridor environments.

RONov 18, 2019
Object Finding in Cluttered Scenes Using Interactive Perception

Tonci Novkovic, Remi Pautrat, Fadri Furrer et al.

Object finding in clutter is a skill that requires perception of the environment and in many cases physical interaction. In robotics, interactive perception defines a set of algorithms that leverage actions to improve the perception of the environment, and vice versa use perception to guide the next action. Scene interactions are difficult to model, therefore, most of the current systems use predefined heuristics. This limits their ability to efficiently search for the target object in a complex environment. In order to remove heuristics and the need for explicit models of the interactions, in this work we propose a reinforcement learning based active and interactive perception system for scene exploration and object search. We evaluate our work both in simulated and in real-world experiments using a robotic manipulator equipped with an RGB and a depth camera, and compare our system to two baselines. The results indicate that our approach, trained in simulation only, transfers smoothly to reality and can solve the object finding task efficiently and with more than 88% success rate.

RONov 8, 2019
Building an Aerial-Ground Robotics System for Precision Farming: An Adaptable Solution

Alberto Pretto, Stéphanie Aravecchia, Wolfram Burgard et al.

The application of autonomous robots in agriculture is gaining increasing popularity thanks to the high impact it may have on food security, sustainability, resource use efficiency, reduction of chemical treatments, and the optimization of human effort and yield. With this vision, the Flourish research project aimed to develop an adaptable robotic solution for precision farming that combines the aerial survey capabilities of small autonomous unmanned aerial vehicles (UAVs) with targeted intervention performed by multi-purpose unmanned ground vehicles (UGVs). This paper presents an overview of the scientific and technological advances and outcomes obtained in the project. We introduce multi-spectral perception algorithms and aerial and ground-based systems developed for monitoring crop density, weed pressure, crop nitrogen nutrition status, and to accurately classify and locate weeds. We then introduce the navigation and mapping systems tailored to our robots in the agricultural environment, as well as the modules for collaborative mapping. We finally present the ground intervention hardware, software solutions, and interfaces we implemented and tested in different field conditions and with different crops. We describe a real use case in which a UAV collaborates with a UGV to monitor the field and to perform selective spraying without human intervention.

ROSep 20, 2019
An Efficient Sampling-based Method for Online Informative Path Planning in Unknown Environments

Lukas Schmid, Michael Pantic, Raghav Khanna et al.

The ability to plan informative paths online is essential to robot autonomy. In particular, sampling-based approaches are often used as they are capable of using arbitrary information gain formulations. However, they are prone to local minima, resulting in sub-optimal trajectories, and sometimes do not reach global coverage. In this paper, we present a new RRT*-inspired online informative path planning algorithm. Our method continuously expands a single tree of candidate trajectories and rewires segments to maintain the tree and refine intermediate trajectories. This allows the algorithm to achieve global coverage and maximize the utility of a path in a global context, using a single objective function. We demonstrate the algorithm's capabilities in the applications of autonomous indoor exploration as well as accurate Truncated Signed Distance Field (TSDF)-based 3D reconstruction on-board a Micro Aerial Vehicle (MAV). We study the impact of commonly used information gain and cost formulations in these scenarios and propose a novel TSDF-based 3D reconstruction gain and cost-utility formulation. Detailed evaluation in realistic simulation environments shows that our approach outperforms state of the art methods in these tasks. Experiments on a real MAV demonstrate the ability of our method to robustly plan in real-time, exploring an indoor environment solely with on-board sensing and computation. We make our framework available for future research.

ROAug 23, 2019
Flexible Trinocular: Non-rigid Multi-Camera-IMU Dense Reconstruction for UAV Navigation and Mapping

Timo Hinzmann, Cesar Cadena, Juan Nieto et al.

In this paper, we propose a visual-inertial framework able to efficiently estimate the camera poses of a non-rigid trinocular baseline for long-range depth estimation on-board a fast moving aerial platform. The estimation of the time-varying baseline is based on relative inertial measurements, a photometric relative pose optimizer, and a probabilistic wing model fused in an efficient Extended Kalman Filter (EKF) formulation. The estimated depth measurements can be integrated into a geo-referenced global map to render a reconstruction of the environment useful for local replanning algorithms. Based on extensive real-world experiments we describe the challenges and solutions for obtaining the probabilistic wing model, reliable relative inertial measurements, and vision-based relative pose updates and demonstrate the computational efficiency and robustness of the overall system under challenging conditions.

ROAug 5, 2019
Free-Space Features: Global Localization in 2D Laser SLAM Using Distance Function Maps

Alexander Millane, Helen Oleynikova, Juan Nieto et al.

In many applications, maintaining a consistent map of the environment is key to enabling robotic platforms to perform higher-level decision making. Detection of already visited locations is one of the primary ways in which map consistency is maintained, especially in situations where external positioning systems are unavailable or unreliable. Mapping in 2D is an important field in robotics, largely due to the fact that man-made environments such as warehouses and homes, where robots are expected to play an increasing role, can often be approximated as planar. Place recognition in this context remains challenging: 2D lidar scans contain scant information with which to characterize, and therefore recognize, a location. This paper introduces a novel approach aimed at addressing this problem. At its core, the system relies on the use of the distance function for representation of geometry. This representation allows extraction of features which describe the geometry of both surfaces and free-space in the environment. We propose a feature for this purpose. Through evaluations on public datasets, we demonstrate the utility of free-space in the description of places, and show an increase in localization performance over a state-of-the-art descriptor extracted from surface geometry.

ROMay 9, 2019
An Omnidirectional Aerial Manipulation Platform for Contact-Based Inspection

Karen Bodie, Maximilian Brunner, Michael Pantic et al.

This paper presents an omnidirectional aerial manipulation platform for robust and responsive interaction with unstructured environments, toward the goal of contact-based inspection. The fully actuated tilt-rotor aerial system is equipped with a rigidly mounted end-effector, and is able to exert a 6 degree of freedom force and torque, decoupling the system's translational and rotational dynamics, and enabling precise interaction with the environment while maintaining stability. An impedance controller with selective apparent inertia is formulated to permit compliance in certain degrees of freedom while achieving precise trajectory tracking and disturbance rejection in others. Experiments demonstrate disturbance rejection, push-and-slide interaction, and on-board state estimation with depth servoing to interact with local surfaces. The system is also validated as a tool for contact-based non-destructive testing of concrete infrastructure.

CVApr 5, 2019
The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation

Hermann Blum, Paul-Edouard Sarlin, Juan Nieto et al.

Deep learning has enabled impressive progress in the accuracy of semantic segmentation. Yet, the ability to estimate uncertainty and detect failure is key for safety-critical applications like autonomous driving. Existing uncertainty estimates have mostly been evaluated on simple tasks, and it is unclear whether these methods generalize to more complex scenarios. We present Fishyscapes, the first public benchmark for uncertainty estimation in a real-world task of semantic segmentation for urban driving. It evaluates pixel-wise uncertainty estimates towards the detection of anomalous objects in front of the vehicle. We adapt state-of-the-art methods to recent semantic segmentation models and compare approaches based on softmax confidence, Bayesian learning, and embedding density. Our results show that anomaly detection is far from solved even for ordinary situations, while our benchmark allows measuring advancements beyond the state-of-the-art.

ROApr 1, 2019
Experimental Comparison of Visual-Aided Odometry Methods for Rail Vehicles

Florian Tschopp, Thomas Schneider, Andrew W. Palmer et al.

Today, rail vehicle localization is based on infrastructure-side Balises (beacons) together with on-board odometry to determine whether a rail segment is occupied. Such a coarse locking leads to a sub-optimal usage of the rail networks. New railway standards propose the use of moving blocks centered around the rail vehicles to increase the capacity of the network. However, this approach requires accurate and robust position and velocity estimation of all vehicles. In this work, we investigate the applicability, challenges and limitations of current visual and visual-inertial motion estimation frameworks for rail applications. An evaluation against RTK-GPS ground truth is performed on multiple datasets recorded in industrial, sub-urban, and forest environments. Our results show that stereo visual-inertial odometry has a great potential to provide a precise motion estimation because of its complementing sensor modalities and shows superior performance in challenging situations compared to other frameworks.

ROMar 1, 2019
Volumetric Instance-Aware Semantic Mapping and 3D Object Discovery

Margarita Grinvald, Fadri Furrer, Tonci Novkovic et al.

To autonomously navigate and plan interactions in real-world environments, robots require the ability to robustly perceive and map complex, unstructured surrounding scenes. Besides building an internal representation of the observed scene geometry, the key insight toward a truly functional understanding of the environment is the usage of higher-level entities during mapping, such as individual object instances. We propose an approach to incrementally build volumetric object-centric maps during online scanning with a localized RGB-D camera. First, a per-frame segmentation scheme combines an unsupervised geometric approach with instance-aware semantic object predictions. This allows us to detect and segment elements both from the set of known classes and from other, previously unseen categories. Next, a data association step tracks the predicted instances across the different frames. Finally, a map integration strategy fuses information about their 3D shape, location, and, if available, semantic class into a global volume. Evaluation on a publicly available dataset shows that the proposed approach for building instance-level semantic maps is competitive with state-of-the-art methods, while additionally able to discover objects of unseen categories. The system is further evaluated within a real-world robotic mapping setup, for which qualitative results highlight the online nature of the method.

ROFeb 25, 2019
Informative Path Planning for Active Field Mapping under Localization Uncertainty

Marija Popovic, Teresa Vidal-Calleja, Jen Jen Chung et al.

Information gathering algorithms play a key role in unlocking the potential of robots for efficient data collection in a wide range of applications. However, most existing strategies neglect the fundamental problem of the robot pose uncertainty, which is an implicit requirement for creating robust, high-quality maps. To address this issue, we introduce an informative planning framework for active mapping that explicitly accounts for the pose uncertainty in both the mapping and planning tasks. Our strategy exploits a Gaussian Process (GP) model to capture a target environmental field given the uncertainty on its inputs. For planning, we formulate a new utility function that couples the localization and field mapping objectives in GP-based mapping scenarios in a principled way, without relying on any manually tuned parameters. Extensive simulations show that our approach outperforms existing strategies, with reductions in mean pose uncertainty and map error. We also present a proof of concept in an indoor temperature mapping scenario.

ROFeb 12, 2019
VIZARD: Reliable Visual Localization for Autonomous Vehicles in Urban Outdoor Environments

Mathias Bürki, Lukas Schaupp, Marcin Dymczyk et al.

Changes in appearance are one of the main sources of failure in visual localization systems in outdoor environments. To address this challenge, we present VIZARD, a visual localization system for urban outdoor environments. By combining a local localization algorithm with the use of multi-session maps, a high localization recall can be achieved across vastly different appearance conditions. The fusion of the visual localization constraints with wheel-odometry in a state estimation framework further guarantees smooth and accurate pose estimates. In an extensive experimental evaluation on several hundreds of driving kilometers in challenging urban outdoor environments, we analyze the recall and accuracy of our localization system, investigate its key parameters and boundary conditions, and compare different types of feature descriptors. Our results show that VIZARD is able to achieve nearly 100% recall with a localization accuracy below 0.5 m under varying outdoor appearance conditions, including at night-time.

ROJan 22, 2019
Observability-aware Self-Calibration of Visual and Inertial Sensors for Ego-Motion Estimation

Thomas Schneider, Mingyang Li, Cesar Cadena et al.

External effects such as shocks and temperature variations affect the calibration of visual-inertial sensor systems and thus they cannot fully rely on factory calibrations. Re-calibrations performed on short user-collected datasets might yield poor performance since the observability of certain parameters is highly dependent on the motion. Additionally, on resource-constrained systems (e.g., mobile phones), full-batch approaches over longer sessions quickly become prohibitively expensive. In this paper, we approach the self-calibration problem by introducing information theoretic metrics to assess the information content of trajectory segments, thus allowing us to select the most informative parts from a dataset for calibration purposes. With this approach, we are able to build compact calibration datasets either: (a) by selecting segments from a long session with limited exciting motion or (b) from multiple short sessions where a single session does not necessarily excite all modes sufficiently. Real-world experiments in four different environments show that the proposed method achieves comparable performance to a batch calibration approach, yet, at a constant computational complexity which is independent of the duration of the session.

ROSep 30, 2018
AgriColMap: Aerial-Ground Collaborative 3D Mapping for Precision Farming

Ciro Potena, Raghav Khanna, Juan Nieto et al.

The combination of aerial survey capabilities of Unmanned Aerial Vehicles with targeted intervention abilities of agricultural Unmanned Ground Vehicles can significantly improve the effectiveness of robotic systems applied to precision agriculture. In this context, building and updating a common map of the field is an essential but challenging task. The maps built using robots of different types show differences in size, resolution and scale, the associated geolocation data may be inaccurate and biased, while the repetitiveness of both visual appearance and geometric structures found within agricultural contexts render classical map merging techniques ineffective. In this paper we propose AgriColMap, a novel map registration pipeline that leverages a grid-based multimodal environment representation which includes a vegetation index map and a Digital Surface Model. We cast the data association problem between maps built from UAVs and UGVs as a multimodal, large displacement dense optical flow estimation. The dominant, coherent flows, selected using a voting scheme, are used as point-to-point correspondences to infer a preliminary non-rigid alignment between the maps. A final refinement is then performed, by exploiting only meaningful parts of the registered maps. We evaluate our system using real world data for 3 fields with different crop species. The results show that our method outperforms several state of the art map registration and matching techniques by a large margin, and has a higher tolerance to large initial misalignments. We release an implementation of the proposed approach along with the acquired datasets with this paper.

ROSep 8, 2018
An informative path planning framework for UAV-based terrain monitoring

Marija Popovic, Teresa Vidal-Calleja, Gregory Hitz et al.

Unmanned Aerial Vehicles (UAVs) represent a new frontier in a wide range of monitoring and research applications. To fully leverage their potential, a key challenge is planning missions for efficient data acquisition in complex environments. To address this issue, this article introduces a general Informative Path Planning (IPP) framework for monitoring scenarios using an aerial robot, focusing on problems in which the value of sensor information is unevenly distributed in a target area and unknown a priori. The approach is capable of learning and focusing on regions of interest via adaptation to map either discrete or continuous variables on the terrain using variable-resolution data received from probabilistic sensors. During a mission, the terrain maps built online are used to plan information-rich trajectories in continuous 3-D space by optimizing initial solutions obtained by a coarse grid search. Extensive simulations show that our approach is more efficient than existing methods. We also demonstrate its real-time application on a photorealistic mapping scenario using a publicly available dataset and demonstrate a proof of concept for an agricultural monitoring task.

ROAug 8, 2018
Map Management for Efficient Long-Term Visual Localization in Outdoor Environments

Mathias Bürki, Marcin Dymczyk, Igor Gilitschenski et al.

We present a complete map management process for a visual localization system designed for multi-vehicle long-term operations in resource-constrained outdoor environments. Outdoor visual localization generates large amounts of data that need to be incorporated into a lifelong visual map in order to allow localization at all times and under all appearance conditions. Processing these large quantities of data is non-trivial, as it is subject to limited computational and storage capabilities both on the vehicle and on the mapping backend. We address this problem with a two-fold map update paradigm capable of either adding new visual cues to the map or updating co-observation statistics. The former, in combination with offline map summarization techniques, allows enhancing the appearance coverage of the lifelong map while keeping the map size limited. On the other hand, the latter is able to significantly boost the appearance-based landmark selection for efficient online localization without incurring any additional computational or storage burden. Our evaluation in challenging outdoor conditions shows that our proposed map management process allows building and maintaining maps for precise visual localization over long time spans in a tractable and scalable fashion.