Lionel Ott

RO
h-index29
46papers
5,083citations
Novelty54%
AI Score51

46 Papers

ROJun 28, 2022
Learning Variable Impedance Control for Aerial Sliding on Uneven Heterogeneous Surfaces by Proprioceptive and Tactile Sensing

Weixuan Zhang, Lionel Ott, Marco Tognon et al.

The recent development of novel aerial vehicles capable of physically interacting with the environment leads to new applications such as contact-based inspection. These tasks require the robotic system to exchange forces with partially-known environments, which may contain uncertainties including unknown spatially-varying friction properties and discontinuous variations of the surface geometry. Finding a control strategy that is robust against these environmental uncertainties remains an open challenge. This paper presents a learning-based adaptive control strategy for aerial sliding tasks. In particular, the gains of a standard impedance controller are adjusted in real-time by a policy based on the current control signals, proprioceptive measurements, and tactile sensing. This policy is trained in simulation with simplified actuator dynamics in a student-teacher learning setup. The real-world performance of the proposed approach is verified using a tilt-arm omnidirectional flying vehicle. The proposed controller structure combines data-driven and model-based control methods, enabling our approach to successfully transfer directly and without adaptation from simulation to the real platform. Compared to fine-tuned state of the art interaction control methods we achieve reduced tracking error and improved disturbance rejection.

ROMay 3, 2022
Sampling-free obstacle gradients and reactive planning in Neural Radiance Fields (NeRF)

Michael Pantic, Cesar Cadena, Roland Siegwart et al.

This work investigates the use of Neural implicit representations, specifically Neural Radiance Fields (NeRF), for geometrical queries and motion planning. We show that by adding the capacity to infer occupancy in a radius to a pre-trained NeRF, we are effectively learning an approximation to a Euclidean Signed Distance Field (ESDF). Using backward differentiation of the augmented network, we obtain an obstacle gradient that is integrated into an obstacle avoidance policy based on the Riemannian Motion Policies (RMP) framework. Thus, our findings allow for very fast sampling-free obstacle avoidance planning in the implicit representation.

ROMar 20, 2023
Neural Implicit Vision-Language Feature Fields

Kenneth Blomqvist, Francesco Milano, Jen Jen Chung et al.

Recently, groundbreaking results have been presented on open-vocabulary semantic image segmentation. Such methods segment each pixel in an image into arbitrary categories provided at run-time in the form of text prompts, as opposed to a fixed set of classes defined at training time. In this work, we present a zero-shot volumetric open-vocabulary semantic scene segmentation method. Our method builds on the insight that we can fuse image features from a vision-language model into a neural implicit representation. We show that the resulting feature field can be segmented into different classes by assigning points to natural language text prompts. The implicit volumetric representation enables us to segment the scene both in 3D and 2D by rendering feature maps from any given viewpoint of the scene. We show that our method works on noisy real-world data and can run in real-time on live sensor data dynamically adjusting to text prompts. We also present quantitative comparisons on the ScanNet dataset.

CVSep 26, 2022
Baking in the Feature: Accelerating Volumetric Segmentation by Rendering Feature Maps

Kenneth Blomqvist, Lionel Ott, Jen Jen Chung et al.

Methods have recently been proposed that densely segment 3D volumes into classes using only color images and expert supervision in the form of sparse semantically annotated pixels. While impressive, these methods still require a relatively large amount of supervision and segmenting an object can take several minutes in practice. Such systems typically only optimize their representation on the particular scene they are fitting, without leveraging any prior information from previously seen images. In this paper, we propose to use features extracted with models trained on large existing datasets to improve segmentation performance. We bake this feature representation into a Neural Radiance Field (NeRF) by volumetrically rendering feature maps and supervising on features extracted from each input image. We show that by baking this representation into the NeRF, we make the subsequent classification task much easier. Our experiments show that our method achieves higher segmentation accuracy with fewer semantic annotations than existing methods over a wide range of scenes.

ROFeb 24Code
Efficient Hierarchical Any-Angle Path Planning on Multi-Resolution 3D Grids

Victor Reijgwart, Cesar Cadena, Roland Siegwart et al.

Hierarchical, multi-resolution volumetric mapping approaches are widely used to represent large and complex environments as they can efficiently capture their occupancy and connectivity information. Yet widely used path planning methods such as sampling and trajectory optimization do not exploit this explicit connectivity information, and search-based methods such as A* suffer from scalability issues in large-scale high-resolution maps. In many applications, Euclidean shortest paths form the underpinning of the navigation system. For such applications, any-angle planning methods, which find optimal paths by connecting corners of obstacles with straight-line segments, provide a simple and efficient solution. In this paper, we present a method that has the optimality and completeness properties of any-angle planners while overcoming computational tractability issues common to search-based methods by exploiting multi-resolution representations. Extensive experiments on real and synthetic environments demonstrate the proposed approach's solution quality and speed, outperforming even sampling-based methods. The framework is open-sourced to allow the robotics and planning community to build on our research.

ROJul 13, 2023
Self-Supervised Learning for Interactive Perception of Surgical Thread for Autonomous Suture Tail-Shortening

Vincent Schorp, Will Panitch, Kaushik Shivakumar et al.

Accurate 3D sensing of suturing thread is a challenging problem in automated surgical suturing because of the high state-space complexity, thinness and deformability of the thread, and possibility of occlusion by the grippers and tissue. In this work we present a method for tracking surgical thread in 3D which is robust to occlusions and complex thread configurations, and apply it to autonomously perform the surgical suture "tail-shortening" task: pulling thread through tissue until a desired "tail" length remains exposed. The method utilizes a learned 2D surgical thread detection network to segment suturing thread in RGB images. It then identifies the thread path in 2D and reconstructs the thread in 3D as a NURBS spline by triangulating the detections from two stereo cameras. Once a 3D thread model is initialized, the method tracks the thread across subsequent frames. Experiments suggest the method achieves a 1.33 pixel average reprojection error on challenging single-frame 3D thread reconstructions, and an 0.84 pixel average reprojection error on two tracking sequences. On the tail-shortening task, it accomplishes a 90% success rate across 20 trials. Supplemental materials are available at https://sites.google.com/berkeley.edu/autolab-surgical-thread/ .

CVJul 16, 2024Code
NeuSurfEmb: A Complete Pipeline for Dense Correspondence-based 6D Object Pose Estimation without CAD Models

Francesco Milano, Jen Jen Chung, Hermann Blum et al.

State-of-the-art approaches for 6D object pose estimation assume the availability of CAD models and require the user to manually set up physically-based rendering (PBR) pipelines for synthetic training data generation. Both factors limit the application of these methods in real-world scenarios. In this work, we present a pipeline that does not require CAD models and allows training a state-of-the-art pose estimator requiring only a small set of real images as input. Our method is based on a NeuS2 object representation, that we learn through a semi-automated procedure based on Structure-from-Motion (SfM) and object-agnostic segmentation. We exploit the novel-view synthesis ability of NeuS2 and simple cut-and-paste augmentation to automatically generate photorealistic object renderings, which we use to train the correspondence-based SurfEmb pose estimator. We evaluate our method on the LINEMOD-Occlusion dataset, extensively studying the impact of its individual components and showing competitive performance with respect to approaches based on CAD models and PBR data. We additionally demonstrate the ease of use and effectiveness of our pipeline on self-collected real-world objects, showing that our method outperforms state-of-the-art CAD-model-free approaches, with better accuracy and robustness to mild occlusions. To allow the robotics community to benefit from this system, we will publicly release it at https://www.github.com/ethz-asl/neusurfemb.

ROJan 4, 2021Code
Volumetric Grasping Network: Real-time 6 DOF Grasp Detection in Clutter

Michel Breyer, Jen Jen Chung, Lionel Ott et al.

General robot grasping in clutter requires the ability to synthesize grasps that work for previously unseen objects and that are also robust to physical interactions, such as collisions with other objects in the scene. In this work, we design and train a network that predicts 6 DOF grasps from 3D scene information gathered from an on-board sensor such as a wrist-mounted depth camera. Our proposed Volumetric Grasping Network (VGN) accepts a Truncated Signed Distance Function (TSDF) representation of the scene and directly outputs the predicted grasp quality and the associated gripper orientation and opening width for each voxel in the queried 3D volume. We show that our approach can plan grasps in only 10 ms and is able to clear 92% of the objects in real-world clutter removal experiments without the need for explicit collision checking. The real-time capability opens up the possibility for closed-loop grasp planning, allowing robots to handle disturbances, recover from errors and provide increased robustness. Code is available at https://github.com/ethz-asl/vgn.

ROOct 19, 2020Code
A Unified Approach for Autonomous Volumetric Exploration of Large Scale Environments under Severe Odometry Drift

Lukas Schmid, Victor Reijgwart, Lionel Ott et al.

Exploration is a fundamental problem in robot autonomy. A major limitation, however, is that during exploration robots oftentimes have to rely on on-board systems alone for state estimation, accumulating significant drift over time in large environments. Drift can be detrimental to robot safety and exploration performance. In this work, a submap-based, multi-layer approach for both mapping and planning is proposed to enable safe and efficient volumetric exploration of large scale environments despite odometry drift. The central idea of our approach combines local (temporally and spatially) and global mapping to guarantee safety and efficiency. Similarly, our planning approach leverages the presented map to compute global volumetric frontiers in a changing global map and utilizes the nature of exploration dealing with partial information for efficient local and global planning. The presented system is thoroughly evaluated and shown to outperform state of the art methods even under drift-free conditions. Our system, termed GLoca}, will be made available open source.

CVMar 21, 2024
Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation

Francesco Di Felice, Alberto Remus, Stefano Gasperini et al.

Estimating the pose of objects through vision is essential to make robotic platforms interact with the environment. Yet, it presents many challenges, often related to the lack of flexibility and generalizability of state-of-the-art solutions. Diffusion models are a cutting-edge neural architecture transforming 2D and 3D computer vision, outlining remarkable performances in zero-shot novel-view synthesis. Such a use case is particularly intriguing for reconstructing 3D objects. However, localizing objects in unstructured environments is rather unexplored. To this end, this work presents Zero123-6D, the first work to demonstrate the utility of Diffusion Model-based novel-view-synthesizers in enhancing RGB 6D pose estimation at category-level, by integrating them with feature extraction techniques. Novel View Synthesis allows to obtain a coarse pose that is refined through an online optimization method introduced in this work to deal with intra-category geometric differences. In such a way, the outlined method shows reduction in data requirements, removal of the necessity of depth information in zero-shot category-level 6D pose estimation task, and increased performance, quantitatively demonstrated through experiments on the CO3D dataset.

CVOct 13, 2025
Towards Fast and Scalable Normal Integration using Continuous Components

Francesco Milano, Jen Jen Chung, Lionel Ott et al.

Surface normal integration is a fundamental problem in computer vision, dealing with the objective of reconstructing a surface from its corresponding normal map. Existing approaches require an iterative global optimization to jointly estimate the depth of each pixel, which scales poorly to larger normal maps. In this paper, we address this problem by recasting normal integration as the estimation of relative scales of continuous components. By constraining pixels belonging to the same component to jointly vary their scale, we drastically reduce the number of optimization variables. Our framework includes a heuristic to accurately estimate continuous components from the start, a strategy to rebalance optimization terms, and a technique to iteratively merge components to further reduce the size of the problem. Our method achieves state-of-the-art results on the standard normal integration benchmark in as little as a few seconds and achieves one-order-of-magnitude speedup over pixel-level approaches on large-resolution normal maps.

ROJul 7, 2025
CueLearner: Bootstrapping and local policy adaptation from relative feedback

Giulio Schiavi, Andrei Cramariuc, Lionel Ott et al.

Human guidance has emerged as a powerful tool for enhancing reinforcement learning (RL). However, conventional forms of guidance such as demonstrations or binary scalar feedback can be challenging to collect or have low information content, motivating the exploration of other forms of human input. Among these, relative feedback (i.e., feedback on how to improve an action, such as "more to the left") offers a good balance between usability and information richness. Previous research has shown that relative feedback can be used to enhance policy search methods. However, these efforts have been limited to specific policy classes and use feedback inefficiently. In this work, we introduce a novel method to learn from relative feedback and combine it with off-policy reinforcement learning. Through evaluations on two sparse-reward tasks, we demonstrate our method can be used to improve the sample efficiency of reinforcement learning by guiding its exploration process. Additionally, we show it can adapt a policy to changes in the environment or the user's preferences. Finally, we demonstrate real-world applicability by employing our approach to learn a navigation policy in a sparse reward setting.

ROFeb 9, 2022
Stein Particle Filter for Nonlinear, Non-Gaussian State Estimation

Fahira Afzal Maken, Fabio Ramos, Lionel Ott

Estimation of a dynamical system's latent state subject to sensor noise and model inaccuracies remains a critical yet difficult problem in robotics. While Kalman filters provide the optimal solution in the least squared sense for linear and Gaussian noise problems, the general nonlinear and non-Gaussian noise case is significantly more complicated, typically relying on sampling strategies that are limited to low-dimensional state spaces. In this paper we devise a general inference procedure for filtering of nonlinear, non-Gaussian dynamical systems that exploits the differentiability of both the update and prediction models to scale to higher dimensional spaces. Our method, Stein particle filter, can be seen as a deterministic flow of particles, embedded in a reproducing kernel Hilbert space, from an initial state to the desirable posterior. The particles evolve jointly to conform to a posterior approximation while interacting with each other through a repulsive force. We evaluate the method in simulation and in complex localization tasks while comparing it to sequential Monte Carlo solutions.

CVJan 19, 2022
Semi-automatic 3D Object Keypoint Annotation and Detection for the Masses

Kenneth Blomqvist, Jen Jen Chung, Lionel Ott et al.

Creating computer vision datasets requires careful planning and lots of time and effort. In robotics research, we often have to use standardized objects, such as the YCB object set, for tasks such as object tracking, pose estimation, grasping and manipulation, as there are datasets and pre-learned methods available for these objects. This limits the impact of our research since learning-based computer vision methods can only be used in scenarios that are supported by existing datasets. In this work, we present a full object keypoint tracking toolkit, encompassing the entire process from data collection, labeling, model learning and evaluation. We present a semi-automatic way of collecting and labeling datasets using a wrist mounted camera on a standard robotic arm. Using our toolkit and method, we are able to obtain a working 3D object keypoint detector and go through the whole process of data collection, annotation and learning in just a couple hours of active time.

RONov 11, 2021
Autonomous Teamed Exploration of Subterranean Environments using Legged and Aerial Robots

Mihir Kulkarni, Mihir Dharmadhikari, Marco Tranzatto et al.

This paper presents a novel strategy for autonomous teamed exploration of subterranean environments using legged and aerial robots. Tailored to the fact that subterranean settings, such as cave networks and underground mines, often involve complex, large-scale and multi-branched topologies, while wireless communication within them can be particularly challenging, this work is structured around the synergy of an onboard exploration path planner that allows for resilient long-term autonomy, and a multi-robot coordination framework. The onboard path planner is unified across legged and flying robots and enables navigation in environments with steep slopes, and diverse geometries. When a communication link is available, each robot of the team shares submaps to a centralized location where a multi-robot coordination framework identifies global frontiers of the exploration space to inform each system about where it should re-position to best continue its mission. The strategy is verified through a field deployment inside an underground mine in Switzerland using a legged and a flying robot collectively exploring for 45 min, as well as a longer simulation study with three systems.

ROJul 9, 2021
Probabilistic Trajectory Prediction with Structural Constraints

Weiming Zhi, Lionel Ott, Fabio Ramos

This work addresses the problem of predicting the motion trajectories of dynamic objects in the environment. Recent advances in predicting motion patterns often rely on machine learning techniques to extrapolate motion patterns from observed trajectories, with no mechanism to directly incorporate known rules. We propose a novel framework, which combines probabilistic learning and constrained trajectory optimisation. The learning component of our framework provides a distribution over future motion trajectories conditioned on observed past coordinates. This distribution is then used as a prior to a constrained optimisation problem which enforces chance constraints on the trajectory distribution. This results in constraint-compliant trajectory distributions which closely resemble the prior. In particular, we focus our investigation on collision constraints, such that extrapolated future trajectory distributions conform to the environment structure. We empirically demonstrate on real-world and simulated datasets the ability of our framework to learn complex probabilistic motion trajectories for motion data, while directly enforcing constraints to improve generalisability, producing more robust and higher quality trajectory distributions.

LGJul 4, 2021
Learning ODEs via Diffeomorphisms for Fast and Robust Integration

Weiming Zhi, Tin Lai, Lionel Ott et al.

Advances in differentiable numerical integrators have enabled the use of gradient descent techniques to learn ordinary differential equations (ODEs). In the context of machine learning, differentiable solvers are central for Neural ODEs (NODEs), a class of deep learning models with continuous depth, rather than discrete layers. However, these integrators can be unsatisfactorily slow and inaccurate when learning systems of ODEs from long sequences, or when solutions of the system vary at widely different timescales in each dimension. In this paper we propose an alternative approach to learning ODEs from data: we represent the underlying ODE as a vector field that is related to another base vector field by a differentiable bijection, modelled by an invertible neural network. By restricting the base ODE to be amenable to integration, we can drastically speed up and improve the robustness of integration. We demonstrate the efficacy of our method in training and evaluating continuous neural networks models, as well as in learning benchmark ODE systems. We observe improvements of up to two orders of magnitude when integrating learned ODEs with GPUs computation.

ROJun 7, 2021
Stein ICP for Uncertainty Estimation in Point Cloud Matching

Fahira Afzal Maken, Fabio Ramos, Lionel Ott

Quantification of uncertainty in point cloud matching is critical in many tasks such as pose estimation, sensor fusion, and grasping. Iterative closest point (ICP) is a commonly used pose estimation algorithm which provides a point estimate of the transformation between two point clouds. There are many sources of uncertainty in this process that may arise due to sensor noise, ambiguous environment, and occlusion. However, for safety critical problems such as autonomous driving, a point estimate of the pose transformation is not sufficient as it does not provide information about the multiple solutions. Current probabilistic ICP methods usually do not capture all sources of uncertainty and may provide unreliable transformation estimates which can have a detrimental effect in state estimation or decision making tasks that use this information. In this work we propose a new algorithm to align two point clouds that can precisely estimate the uncertainty of ICP's transformation parameters. We develop a Stein variational inference framework with gradient based optimization of ICP's cost function. The method provides a non-parametric estimate of the transformation, can model complex multi-modal distributions, and can be effectively parallelized on a GPU. Experiments using 3D kinect data as well as sparse indoor/outdoor LiDAR data show that our method is capable of efficiently producing accurate pose uncertainty estimates.

ROApr 17, 2021
Spherical Multi-Modal Place Recognition for Heterogeneous Sensor Systems

Lukas Bernreiter, Lionel Ott, Juan Nieto et al.

In this paper, we propose a robust end-to-end multi-modal pipeline for place recognition where the sensor systems can differ from the map building to the query. Our approach operates directly on images and LiDAR scans without requiring any local feature extraction modules. By projecting the sensor data onto the unit sphere, we learn a multi-modal descriptor of partially overlapping scenes using a spherical convolutional neural network. The employed spherical projection model enables the support of arbitrary LiDAR and camera systems readily without losing information. Loop closure candidates are found using a nearest-neighbor lookup in the embedding space. We tackle the problem of correctly identifying the closest place by correlating the candidates' power spectra, obtaining a confidence value per prospect. Our estimate for the correct place corresponds then to the candidate with the highest confidence. We evaluate our proposal w.r.t. state-of-the-art approaches in place recognition using real-world data acquired using different sensors. Our approach can achieve a recall that is up to 10% and 5% higher than for a LiDAR- and vision-based system, respectively, when the sensor setup differs between model training and deployment. Additionally, our place selection can correctly identify up to 95% matches from the candidate set.

ROFeb 20, 2021
Mesh Manifold based Riemannian Motion Planning for Omnidirectional Micro Aerial Vehicles

Michael Pantic, Lionel Ott, Cesar Cadena et al.

This paper presents a novel on-line path planning method that enables aerial robots to interact with surfaces. We present a solution to the problem of finding trajectories that drive a robot towards a surface and move along it. Triangular meshes are used as a surface map representation that is free of fixed discretization and allows for very large workspaces. We propose to leverage planar parametrization methods to obtain a lower-dimensional topologically equivalent representation of the original surface. Furthermore, we interpret the original surface and its lower-dimensional representation as manifold approximations that allow the use of Riemannian Motion Policies (RMPs), resulting in an efficient, versatile, and elegant motion generation framework. We compare against several Rapidly-exploring Random Tree (RRT) planners, a customized CHOMP variant, and the discrete geodesic algorithm. Using extensive simulations on real-world data we show that the proposed planner can reliably plan high-quality near-optimal trajectories at minimal computational cost. The accompanying multimedia attachment demonstrates feasibility on a real OMAV. The obtained paths show less than 10% deviation from the theoretical optimum while facilitating reactive re-planning at kHz refresh rates, enabling flying robots to perform motion planning for interaction with complex surfaces.

ROFeb 3, 2021
PHASER: a Robust and Correspondence-free Global Pointcloud Registration

Lukas Bernreiter, Lionel Ott, Juan Nieto et al.

We propose PHASER, a correspondence-free global registration of sensor-centric pointclouds that is robust to noise, sparsity, and partial overlaps. Our method can seamlessly handle multimodal information and does not rely on keypoint nor descriptor preprocessing modules. By exploiting properties of Fourier analysis, PHASER operates directly on the sensor's signal, fusing the spectra of multiple channels and computing the 6-DoF transformation based on correlation. Our registration pipeline starts by finding the most likely rotation followed by computing the most likely translation. Both estimates are distributed according to a probability distribution that takes the underlying manifold into account, i.e., a Bingham and Gaussian distribution, respectively. This further allows our approach to consider the periodic-nature of rotations and naturally represent its uncertainty. We extensively compare PHASER against several well-known registration algorithms on both simulated datasets, and real-world data acquired using different sensor configurations. Our results show that PHASER can globally align pointclouds in less than 100ms with an average accuracy of 2cm and 0.5deg, is resilient against noise, and can handle partial overlap.

ROJan 20, 2021
Active Model Learning using Informative Trajectories for Improved Closed-Loop Control on Real Robots

Weixuan Zhang, Marco Tognon, Lionel Ott et al.

Model-based controllers on real robots require accurate knowledge of the system dynamics to perform optimally. For complex dynamics, first-principles modeling is not sufficiently precise, and data-driven approaches can be leveraged to learn a statistical model from real experiments. However, the efficient and effective data collection for such a data-driven system on real robots is still an open challenge. This paper introduces an optimization problem formulation to find an informative trajectory that allows for efficient data collection and model learning. We present a sampling-based method that computes an approximation of the trajectory that minimizes the prediction uncertainty of the dynamics model. This trajectory is then executed, collecting the data to update the learned model. In experiments we demonstrate the capabilities of our proposed framework when applied to a complex omnidirectional flying vehicle with tiltable rotors. Using our informative trajectories results in models which outperform models obtained from non-informative trajectory by 13.3\% with the same amount of training data. Furthermore, we show that the model learned from informative trajectories generalizes better than the one learned from non-informative trajectories, achieving better tracking performance on different tasks.

RONov 12, 2020
Anticipatory Navigation in Crowds by Probabilistic Prediction of Pedestrian Future Movements

Weiming Zhi, Tin Lai, Lionel Ott et al.

Critical for the coexistence of humans and robots in dynamic environments is the capability for agents to understand each other's actions, and anticipate their movements. This paper presents Stochastic Process Anticipatory Navigation (SPAN), a framework that enables nonholonomic robots to navigate in environments with crowds, while anticipating and accounting for the motion patterns of pedestrians. To this end, we learn a predictive model to predict continuous-time stochastic processes to model future movement of pedestrians. Anticipated pedestrian positions are used to conduct chance constrained collision-checking, and are incorporated into a time-to-collision control problem. An occupancy map is also integrated to allow for probabilistic collision-checking with static obstacles. We demonstrate the capability of SPAN in crowded simulation environments, as well as with a real-world pedestrian dataset.

ROOct 20, 2020
Automatic Extension of a Symbolic Mobile Manipulation Skill Set

Julian Förster, Lionel Ott, Juan Nieto et al.

Symbolic planning can provide an intuitive interface for non-expert users to operate autonomous robots by abstracting away much of the low-level programming. However, symbolic planners assume that the initially provided abstract domain and problem descriptions are closed and complete. This means that they are fundamentally unable to adapt to changes in the environment or task that are not captured by the initial description. We propose a method that allows an agent to automatically extend its skill set, and thus the abstract description, upon encountering such a situation. We introduce strategies for generalizing from previous experience, completing sequences of key actions and discovering preconditions to ensure the efficiency of our skill sequence exploration scheme. The resulting system is evaluated in simulation on object rearrangement tasks. Compared to a Monte Carlo Tree Search baseline, our strategies for efficient search have on average a 29% higher success rate at a 68% faster runtime.

ROJun 23, 2020
Learning dynamics for improving control of overactuated flying systems

Weixuan Zhang, Maximilian Brunner, Lionel Ott et al.

Overactuated omnidirectional flying vehicles are capable of generating force and torque in any direction, which is important for applications such as contact-based industrial inspection. This comes at the price of an increase in model complexity. These vehicles usually have non-negligible, repetitive dynamics that are hard to model, such as the aerodynamic interference between the propellers. This makes it difficult for high-performance trajectory tracking using a model-based controller. This paper presents an approach that combines a data-driven and a first-principle model for the system actuation and uses it to improve the controller. In a first step, the first-principle model errors are learned offline using a Gaussian Process (GP) regressor. At runtime, the first-principle model and the GP regressor are used jointly to obtain control commands. This is formulated as an optimization problem, which avoids ambiguous solutions present in a standard inverse model in overactuated systems, by only using forward models. The approach is validated using a tilt-arm overactuated omnidirectional flying vehicle performing attitude trajectory tracking. The results show that with our proposed method, the attitude trajectory error is reduced by 32% on average as compared to a nominal PID controller.

ROApr 16, 2020
Estimating Motion Uncertainty with Bayesian ICP

Fahira Afzal Maken, Fabio Ramos, Lionel Ott

Accurate uncertainty estimation associated with the pose transformation between two 3D point clouds is critical for autonomous navigation, grasping, and data fusion. Iterative closest point (ICP) is widely used to estimate the transformation between point cloud pairs by iteratively performing data association and motion estimation. Despite its success and popularity, ICP is effectively a deterministic algorithm, and attempts to reformulate it in a probabilistic manner generally do not capture all sources of uncertainty, such as data association errors and sensor noise. This leads to overconfident transformation estimates, potentially compromising the robustness of systems relying on them. In this paper we propose a novel method to estimate pose uncertainty in ICP with a Markov Chain Monte Carlo (MCMC) algorithm. Our method combines recent developments in optimization for scalable Bayesian sampling such as stochastic gradient Langevin dynamics (SGLD) to infer a full posterior distribution of the pose transformation between two point clouds. We evaluate our method, called Bayesian ICP, in experiments using 3D Kinect data demonstrating that our method is capable of both quickly and accuractely estimating pose uncertainty, taking into account data association uncertainty as reflected by the shape of the objects.

ROApr 2, 2020
Go Fetch: Mobile Manipulation in Unstructured Environments

Kenneth Blomqvist, Michel Breyer, Andrei Cramariuc et al.

With humankind facing new and increasingly large-scale challenges in the medical and domestic spheres, automation of the service sector carries a tremendous potential for improved efficiency, quality, and safety of operations. Mobile robotics can offer solutions with a high degree of mobility and dexterity, however these complex systems require a multitude of heterogeneous components to be carefully integrated into one consistent framework. This work presents a mobile manipulation system that combines perception, localization, navigation, motion planning and grasping skills into one common workflow for fetch and carry applications in unstructured indoor environments. The tight integration across the various modules is experimentally demonstrated on the task of finding a commonly available object in an office environment, grasping it, and delivering it to a desired drop-off location. The accompanying video is available at https://youtu.be/e89_Xg1sLnY.

ROFeb 18, 2020
DISCO: Double Likelihood-free Inference Stochastic Control

Lucas Barcelos, Rafael Oliveira, Rafael Possas et al.

Accurate simulation of complex physical systems enables the development, testing, and certification of control strategies before they are deployed into the real systems. As simulators become more advanced, the analytical tractability of the differential equations and associated numerical solvers incorporated in the simulations diminishes, making them difficult to analyse. A potential solution is the use of probabilistic inference to assess the uncertainty of the simulation parameters given real observations of the system. Unfortunately the likelihood function required for inference is generally expensive to compute or totally intractable. In this paper we propose to leverage the power of modern simulators and recent techniques in Bayesian statistics for likelihood-free inference to design a control framework that is efficient and robust with respect to the uncertainty over simulation parameters. The posterior distribution over simulation parameters is propagated through a potentially non-analytical model of the system with the unscented transform, and a variant of the information theoretical model predictive control. This approach provides a more efficient way to evaluate trajectory roll outs than Monte Carlo sampling, reducing the online computation burden. Experiments show that the controller proposed attained superior performance and robustness on classical control and robotics tasks when compared to models not accounting for the uncertainty over model parameters.

LGNov 20, 2019
Bayesian Curiosity for Efficient Exploration in Reinforcement Learning

Tom Blau, Lionel Ott, Fabio Ramos

Balancing exploration and exploitation is a fundamental part of reinforcement learning, yet most state-of-the-art algorithms use a naive exploration protocol like $ε$-greedy. This contributes to the problem of high sample complexity, as the algorithm wastes effort by repeatedly visiting parts of the state space that have already been explored. We introduce a novel method based on Bayesian linear regression and latent space embedding to generate an intrinsic reward signal that encourages the learning agent to seek out unexplored parts of the state space. This method is computationally efficient, simple to implement, and can extend any state-of-the-art reinforcement learning algorithm. We evaluate the method on a range of algorithms and challenging control tasks, on both simulated and physical robots, demonstrating how the proposed method can significantly improve sample complexity.

ROSep 25, 2019
OCTNet: Trajectory Generation in New Environments from Past Experiences

Weiming Zhi, Tin Lai, Lionel Ott et al.

Being able to safely operate for extended periods of time in dynamic environments is a critical capability for autonomous systems. This generally involves the prediction and understanding of motion patterns of dynamic entities, such as vehicles and people, in the surroundings. Many motion prediction methods in the literature can learn a function, mapping position and time to potential trajectories taken by people or other dynamic entities. However, these predictions depend only on previously observed trajectories, and do not explicitly take into consideration the environment. Trends of motion obtained in one environment are typically specific to that environment, and are not used to better predict motion in other environments. In this paper, we address the problem of generating likely motion dynamics conditioned on the environment, represented as an occupancy map. We introduce the Occupancy Conditional Trajectory Network (OCTNet) framework, capable of generalising the previously observed motion in known environments, to generate trajectories in new environments where no observations of motion has not been observed. OCTNet encodes trajectories as a fixed-sized vector of parameters and utilises neural networks to learn conditional distributions over parameters. We empirically demonstrate our method's ability to generate complex multi-modal trajectory patterns in different environments.

ROSep 20, 2019
An Efficient Sampling-based Method for Online Informative Path Planning in Unknown Environments

Lukas Schmid, Michael Pantic, Raghav Khanna et al.

The ability to plan informative paths online is essential to robot autonomy. In particular, sampling-based approaches are often used as they are capable of using arbitrary information gain formulations. However, they are prone to local minima, resulting in sub-optimal trajectories, and sometimes do not reach global coverage. In this paper, we present a new RRT*-inspired online informative path planning algorithm. Our method continuously expands a single tree of candidate trajectories and rewires segments to maintain the tree and refine intermediate trajectories. This allows the algorithm to achieve global coverage and maximize the utility of a path in a global context, using a single objective function. We demonstrate the algorithm's capabilities in the applications of autonomous indoor exploration as well as accurate Truncated Signed Distance Field (TSDF)-based 3D reconstruction on-board a Micro Aerial vehicle (MAV). We study the impact of commonly used information gain and cost formulations in these scenarios and propose a novel TSDF-based 3D reconstruction gain and cost-utility formulation. Detailed evaluation in realistic simulation environments show that our approach outperforms state of the art methods in these tasks. Experiments on a real MAV demonstrate the ability of our method to robustly plan in real-time, exploring an indoor environment solely with on-board sensing and computation. We make our framework available for future research.

ROJul 22, 2019
Speeding Up Iterative Closest Point Using Stochastic Gradient Descent

Fahira Afzal Maken, Fabio Ramos, Lionel Ott

Sensors producing 3D point clouds such as 3D laser scanners and RGB-D cameras are widely used in robotics, be it for autonomous driving or manipulation. Aligning point clouds produced by these sensors is a vital component in such applications to perform tasks such as model registration, pose estimation, and SLAM. Iterative closest point (ICP) is the most widely used method for this task, due to its simplicity and efficiency. In this paper we propose a novel method which solves the optimisation problem posed by ICP using stochastic gradient descent (SGD). Using SGD allows us to improve the convergence speed of ICP without sacrificing solution quality. Experiments using Kinect as well as Velodyne data show that, our proposed method is faster than existing methods, while obtaining solutions comparable to standard ICP. An additional benefit is robustness to parameters when processing data from different sensors.

ROJul 11, 2019
Kernel Trajectory Maps for Multi-Modal Probabilistic Motion Prediction

Weiming Zhi, Lionel Ott, Fabio Ramos

Understanding the dynamics of an environment, such as the movement of humans and vehicles, is crucial for agents to achieve long-term autonomy in urban environments. This requires the development of methods to capture the multi-modal and probabilistic nature of motion patterns. We present Kernel Trajectory Maps (KTM) to capture the trajectories of movement in an environment. KTMs leverage the expressiveness of kernels from non-parametric modelling by projecting input trajectories onto a set of representative trajectories, to condition on a sequence of observed waypoint coordinates, and predict a multi-modal distribution over possible future trajectories. The output is a mixture of continuous stochastic processes, where each realisation is a continuous functional trajectory, which can be queried at arbitrarily fine time steps.

AIJun 18, 2019
Learning to Plan Hierarchically from Curriculum

Philippe Morere, Lionel Ott, Fabio Ramos

We present a framework for learning to plan hierarchically in domains with unknown dynamics. We enhance planning performance by exploiting problem structure in several ways: (i) We simplify the search over plans by leveraging knowledge of skill objectives, (ii) Shorter plans are generated by enforcing aggressively hierarchical planning, (iii) We learn transition dynamics with sparse local models for better generalisation. Our framework decomposes transition dynamics into skill effects and success conditions, which allows fast planning by reasoning on effects, while learning conditions from interactions with the world. We propose a simple method for learning new abstract skills, using successful trajectories stemming from completing the goals of a curriculum. Learned skills are then refined to leverage other abstract skills and enhance subsequent planning. We show that both conditions and abstract skills can be learned simultaneously while planning, even in stochastic domains. Our method is validated in experiments of increasing complexity, with up to 2^100 states, showing superior planning to classic non-hierarchical planners or reinforcement learning methods. Applicability to real-world problems is demonstrated in a simulation-to-real transfer experiment on a robotic manipulator.

LGFeb 21, 2019
Bayesian optimisation under uncertain inputs

Rafael Oliveira, Lionel Ott, Fabio Ramos

Bayesian optimisation (BO) has been a successful approach to optimise functions which are expensive to evaluate and whose observations are noisy. Classical BO algorithms, however, do not account for errors about the location where observations are taken, which is a common issue in problems with physical components. In these cases, the estimation of the actual query location is also subject to uncertainty. In this context, we propose an upper confidence bound (UCB) algorithm for BO problems where both the outcome of a query and the true query location are uncertain. The algorithm employs a Gaussian process model that takes probability distributions as inputs. Theoretical results are provided for both the proposed algorithm and a conventional UCB approach within the uncertain-inputs setting. Finally, we evaluate each method's performance experimentally, comparing them to other input noise aware BO approaches on simulated scenarios involving synthetic and real data.

ROMay 3, 2018
Functional Path Optimisation for Exploration in Continuous Occupancy Maps

Gilad Francis, Lionel Ott, Fabio Ramos

Autonomous exploration is a complex task where the robot moves through an unknown environment with the goal of mapping it. The desired output of such a process is a sequence of paths that efficiently and safely minimise the uncertainty of the resulting map. However, optimising over the entire space of possible paths is computationally intractable. Therefore, most exploration methods relax the general problem by optimising a simpler one, for example finding the single next best view. In this work, we formulate exploration as a variational problem which allows us to directly optimise in the space of trajectories using functional gradient methods, searching for the Next Best Path (NBP). We take advantage of the recently introduced Hilbert maps to devise an information-based functional that can be computed in closed-form. The resulting trajectories are continuous and maximise safety as well as mutual information. In experiments we verify the ability of the proposed method to find smooth and safe paths and compare these results with other exploration methods.

ROFeb 17, 2018
Learning to Race through Coordinate Descent Bayesian Optimisation

Rafael Oliveira, Fernando H. M. Rocha, Lionel Ott et al.

In the automation of many kinds of processes, the observable outcome can often be described as the combined effect of an entire sequence of actions, or controls, applied throughout its execution. In these cases, strategies to optimise control policies for individual stages of the process might not be applicable, and instead the whole policy might have to be optimised at once. On the other hand, the cost to evaluate the policy's performance might also be high, being desirable that a solution can be found with as few interactions as possible with the real system. We consider the problem of optimising control policies to allow a robot to complete a given race track within a minimum amount of time. We assume that the robot has no prior information about the track or its own dynamical model, just an initial valid driving example. Localisation is only applied to monitor the robot and to provide an indication of its position along the track's centre axis. We propose a method for finding a policy that minimises the time per lap while keeping the vehicle on the track using a Bayesian optimisation (BO) approach over a reproducing kernel Hilbert space. We apply an algorithm to search more efficiently over high-dimensional policy-parameter spaces with BO, by iterating over each dimension individually, in a sequential coordinate descent-like scheme. Experiments demonstrate the performance of the algorithm against other methods in a simulated car racing environment.

RODec 14, 2017
Learning to Navigate by Growing Deep Networks

Thushan Ganegedara, Lionel Ott, Fabio Ramos

Adaptability is central to autonomy. Intuitively, for high-dimensional learning problems such as navigating based on vision, internal models with higher complexity allow to accurately encode the information available. However, most learning methods rely on models with a fixed structure and complexity. In this paper, we present a self-supervised framework for robots to learn to navigate, without any prior knowledge of the environment, by incrementally building the structure of a deep network as new data becomes available. Our framework captures images from a monocular camera and self labels the images to continuously train and predict actions from a computationally efficient adaptive deep architecture based on Autoencoders (AE), in a self-supervised fashion. The deep architecture, named Reinforced Adaptive Denoising Autoencoders (RA-DAE), uses reinforcement learning to dynamically change the network structure by adding or removing neurons. Experiments were conducted in simulation and real-world indoor and outdoor environments to assess the potential of self-supervised navigation. RA-DAE demonstrates better performance than equivalent non-adaptive deep learning alternatives and can continue to expand its knowledge, trading-off past and present information.

ROSep 7, 2017
Bayesian Optimisation for Safe Navigation under Localisation Uncertainty

Rafael Oliveira, Lionel Ott, Vitor Guizilini et al.

In outdoor environments, mobile robots are required to navigate through terrain with varying characteristics, some of which might significantly affect the integrity of the platform. Ideally, the robot should be able to identify areas that are safe for navigation based on its own percepts about the environment while avoiding damage to itself. Bayesian optimisation (BO) has been successfully applied to the task of learning a model of terrain traversability while guiding the robot through more traversable areas. An issue, however, is that localisation uncertainty can end up guiding the robot to unsafe areas and distort the model being learnt. In this paper, we address this problem and present a novel method that allows BO to consider localisation uncertainty by applying a Gaussian process model for uncertain inputs as a prior. We evaluate the proposed method in simulation and in experiments with a real robot navigating over rough terrain and compare it against standard BO methods.

ROMay 17, 2017
Stochastic Functional Gradient Path Planning in Occupancy Maps

Gilad Francis, Lionel Ott, Fabio Ramos

Planning safe paths is a major building block in robot autonomy. It has been an active field of research for several decades, with a plethora of planning methods. Planners can be generally categorised as either trajectory optimisers or sampling-based planners. The latter is the predominant planning paradigm for occupancy maps. Trajectory optimisation entails major algorithmic changes to tackle contextual information gaps caused by incomplete sensor coverage of the map. However, the benefits are substantial, as trajectory optimisers can reason on the trade-off between path safety and efficiency. In this work, we improve our previous work on stochastic functional gradient planners. We introduce a novel expressive path representation based on kernel approximation, that allows cost effective model updates based on stochastic samples. The main drawback of the previous stochastic functional gradient planner was the cubic cost, stemming from its non-parametric path representation. Our novel approximate kernel based model, on the other hand, has a fixed linear cost that depends solely on the number of features used to represent the path. We show that the stochasticity of the samples is crucial for the planner and present comparisons to other state-of-the-art planning methods in both simulation and with real occupancy data. The experiments demonstrate the advantages of the stochastic approximate kernel method for path planning in occupancy maps.

ROMar 1, 2017
Occupancy Map Building through Bayesian Exploration

Gilad Francis, Lionel Ott, Roman Marchant et al.

We propose a novel holistic approach for safe autonomous exploration and map building based on constrained Bayesian optimisation. This method finds optimal continuous paths instead of discrete sensing locations that inherently satisfy motion and safety constraints. Evaluating both the objective and constraints functions requires forward simulation of expected observations. As such evaluations are costly, the Bayesian optimiser proposes only paths which are likely to yield optimal results and satisfy the constraints with high confidence. By balancing the reward and risk associated with each path, the optimiser minimises the number of expensive function evaluations. We demonstrate the effectiveness of our approach in a series of experiments both in simulation and with a real ground robot and provide comparisons to other exploration techniques. Evidently, each method has its specific favourable conditions, where it outperforms all other techniques. Yet, by reasoning on the usefulness of the entire path instead of its end point, our method provides a robust and consistent performance through all tests and performs better than or as good as the other leading methods.

ROMar 1, 2017
Stochastic Functional Gradient for Motion Planning in Continuous Occupancy Maps

Gilad Francis, Lionel Ott, Fabio Ramos

Safe path planning is a crucial component in autonomous robotics. The many approaches to find a collision free path can be categorically divided into trajectory optimisers and sampling-based methods. When planning using occupancy maps, the sampling-based approach is the prevalent method. The main drawback of such techniques is that the reasoning about the expected cost of a plan is limited to the search heuristic used by each method. We introduce a novel planning method based on trajectory optimisation to plan safe and efficient paths in continuous occupancy maps. We extend the expressiveness of the state-of-the-art functional gradient optimisation methods by devising a stochastic gradient update rule to optimise a path represented as a Gaussian process. This approach avoids the need to commit to a specific resolution of the path representation, whether spatial or parametric. We utilise a continuous occupancy map representation in order to define our optimisation objective, which enables fast computation of occupancy gradients. We show that this approach is essential in order to ensure convergence to the optimal path, and present results and comparisons to other planning methods in both simulation and with real laser data. The experiments demonstrate the benefits of using this technique when planning for safe and efficient paths in continuous occupancy maps.

CVJan 7, 2017
Urban Scene Segmentation with Laser-Constrained CRFs

Charika De Alvis, Lionel Ott, Fabio Ramos

Robots typically possess sensors of different modalities, such as colour cameras, inertial measurement units, and 3D laser scanners. Often, solving a particular problem becomes easier when more than one modality is used. However, while there are undeniable benefits to combine sensors of different modalities the process tends to be complicated. Segmenting scenes observed by the robot into a discrete set of classes is a central requirement for autonomy as understanding the scene is the first step to reason about future situations. Scene segmentation is commonly performed using either image data or 3D point cloud data. In computer vision many successful methods for scene segmentation are based on conditional random fields (CRF) where the maximum a posteriori (MAP) solution to the segmentation can be obtained by inference. In this paper we devise a new CRF inference method for scene segmentation that incorporates global constraints, enforcing the sets of nodes are assigned the same class label. To do this efficiently, the CRF is formulated as a relaxed quadratic program whose MAP solution is found using a gradient-based optimisation approach. The proposed method is evaluated on images and 3D point cloud data gathered in urban environments where image data provides the appearance features needed by the CRF, while the 3D point cloud data provides global spatial constraints over sets of nodes. Comparisons with belief propagation, conventional quadratic programming relaxation, and higher order potential CRF show the benefits of the proposed method.

LGAug 8, 2016
Online Adaptation of Deep Architectures with Reinforcement Learning

Thushan Ganegedara, Lionel Ott, Fabio Ramos

Online learning has become crucial to many problems in machine learning. As more data is collected sequentially, quickly adapting to changes in the data distribution can offer several competitive advantages such as avoiding loss of prior knowledge and more efficient learning. However, adaptation to changes in the data distribution (also known as covariate shift) needs to be performed without compromising past knowledge already built in into the model to cope with voluminous and dynamic data. In this paper, we propose an online stacked Denoising Autoencoder whose structure is adapted through reinforcement learning. Our algorithm forces the network to exploit and explore favourable architectures employing an estimated utility function that maximises the accuracy of an unseen validation sequence. Different actions, such as Pool, Increment and Merge are available to modify the structure of the network. As we observe through a series of experiments, our approach is more responsive, robust, and principled than its counterparts for non-stationary as well as stationary data distributions. Experimental results indicate that our algorithm performs better at preserving gained prior knowledge and responding to changes in the data distribution.

CVFeb 2, 2016
Simple Online and Realtime Tracking

Alex Bewley, Zongyuan Ge, Lionel Ott et al.

This paper explores a pragmatic approach to multiple object tracking where the main focus is to associate objects efficiently for online and realtime applications. To this end, detection quality is identified as a key factor influencing tracking performance, where changing the detector can improve tracking by up to 18.9%. Despite only using a rudimentary combination of familiar techniques such as the Kalman Filter and Hungarian algorithm for the tracking components, this approach achieves an accuracy comparable to state-of-the-art online trackers. Furthermore, due to the simplicity of our tracking method, the tracker updates at a rate of 260 Hz which is over 20x faster than other state-of-the-art trackers.

LGMar 6, 2014
Integer Programming Relaxations for Integrated Clustering and Outlier Detection

Lionel Ott, Linsey Pang, Fabio Ramos et al.

In this paper we present methods for exemplar based clustering with outlier selection based on the facility location formulation. Given a distance function and the number of outliers to be found, the methods automatically determine the number of clusters and outliers. We formulate the problem as an integer program to which we present relaxations that allow for solutions that scale to large data sets. The advantages of combining clustering and outlier selection include: (i) the resulting clusters tend to be compact and semantically coherent (ii) the clusters are more robust against data perturbations and (iii) the outliers are contextualised by the clusters and more interpretable, i.e. it is easier to distinguish between outliers which are the result of data errors from those that may be indicative of a new pattern emergent in the data. We present and contrast three relaxations to the integer program formulation: (i) a linear programming formulation (LP) (ii) an extension of affinity propagation to outlier detection (APOC) and (iii) a Lagrangian duality based formulation (LD). Evaluation on synthetic as well as real data shows the quality and scalability of these different methods.