Robert Mahony

CV
h-index: 7
33 papers
1,141 citations
Novelty: 52%
AI Score: 52

33 Papers

SY Jun 4
Tracking Control for a Dynamic Model of an Underwater Submersible

Matthew Hampsey, Pieter van Goor, Ravi Banavar et al.

Underwater vehicles are naturally modelled as rigid bodies on SE(3) subjected to added mass effects. The passivity of the Hamiltonian structure of the system can be exploited to design energy-based stabilising controllers; however, the extension of these control designs to tracking control is not trivial since the error system for the classical error formulations is not itself Hamiltonian. In this paper, we show that a novel choice of error function leads to error dynamics that are Hamiltonian. We go on to derive an energy-based tracking control for a fully coupled model of a submersible vehicle. Asymptotic convergence of the control scheme is proved and the control is demonstrated in a simulation study of the Blue Robotics BlueROV2 Heavy submersible.

CV May 17, 2022
A Linear Comb Filter for Event Flicker Removal

Ziwei Wang, Dingran Yuan, Yonhon Ng et al.

Event cameras are bio-inspired sensors that capture per-pixel asynchronous intensity change rather than the synchronous absolute intensity frames captured by a classical camera sensor. Such cameras are ideal for robotics applications since they have high temporal resolution, high dynamic range and low latency. However, due to their high temporal resolution, event cameras are particularly sensitive to flicker such as from fluorescent or LED lights. During every cycle from bright to dark, pixels that image a flickering light source generate many events that provide little or no useful information for a robot, swamping the useful data in the scene. In this paper, we propose a novel linear filter to preprocess event data to remove unwanted flicker events from an event stream. The proposed algorithm achieves over 4.6 times relative improvement in the signal-to-noise ratio when compared to the raw event stream due to the effective removal of flicker from fluorescent lighting. Thus, it is ideally suited to robotics applications that operate in indoor settings or scenes illuminated by flickering light sources.
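
As a rough illustration of the comb-filter idea (not the paper's asynchronous, event-based implementation), a feed-forward comb filter on a uniformly sampled signal subtracts a copy of the signal delayed by one flicker period, which cancels any component that repeats with that period. The function name `comb_filter`, the sample-based delay, and the synthetic signal below are assumptions made for this sketch.

```python
import math

def comb_filter(samples, period):
    """Feed-forward comb filter: y[n] = x[n] - x[n - period].

    Any component that repeats exactly every `period` samples is cancelled.
    The first `period` outputs pass through unchanged (no history yet).
    """
    out = []
    for n, x in enumerate(samples):
        delayed = samples[n - period] if n >= period else 0.0
        out.append(x - delayed)
    return out

# Synthetic flicker: a sinusoid repeating every 10 samples (assumed period).
PERIOD = 10
flicker = [math.sin(2 * math.pi * n / PERIOD) for n in range(50)]
# A single step change in the scene at sample 30 (the "useful" signal).
scene = [0.0 if n < 30 else 1.0 for n in range(50)]
observed = [f + s for f, s in zip(flicker, scene)]

filtered = comb_filter(observed, PERIOD)
```

In this sketch the periodic component vanishes once one period of history is available, while the step in the scene survives as a pulse one period long. A real event stream is asynchronous, so the sampled delay line here is only a stand-in for the idea.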

CV Jul 20, 2023
Asynchronous Blob Tracker for Event Cameras

Ziwei Wang, Timothy Molloy, Pieter van Goor et al.

Event-based cameras are popular for tracking fast-moving objects due to their high temporal resolution, low latency, and high dynamic range. In this paper, we propose a novel algorithm for tracking event blobs using raw events asynchronously in real time. We introduce the concept of an event blob as a spatio-temporal likelihood of event occurrence where the conditional spatial likelihood is blob-like. Many real-world objects such as car headlights or any quickly moving foreground objects generate event blob data. The proposed algorithm uses a nearest neighbour classifier with a dynamic threshold criterion for data association coupled with an extended Kalman filter to track the event blob state. Our algorithm achieves highly accurate blob tracking, velocity estimation, and shape estimation even under challenging lighting conditions and high-speed motions (> 11000 pixels/s). The microsecond time resolution achieved means that the filter output can be used to derive secondary information such as time-to-contact or range estimation, which will enable applications to real-world problems such as collision avoidance in autonomous driving.

RO Apr 18, 2023
GoferBot: A Visual Guided Human-Robot Collaborative Assembly System

Zheyu Zhuang, Yizhak Ben-Shabat, Jiahao Zhang et al.

The current transformation towards smart manufacturing has led to a growing demand for human-robot collaboration (HRC) in the manufacturing process. Perceiving and understanding the human co-worker's behaviour introduces challenges for collaborative robots to efficiently and effectively perform tasks in unstructured and dynamic environments. Integrating recent data-driven machine vision capabilities into HRC systems is a logical next step in addressing these challenges. However, in these cases, off-the-shelf components struggle due to generalisation limitations. Real-world evaluation is required in order to fully appreciate the maturity and robustness of these approaches. Furthermore, understanding the pure-vision aspects is a crucial first step before combining multiple modalities in order to understand the limitations. In this paper, we propose GoferBot, a novel vision-based semantic HRC system for a real-world assembly task. It is composed of a visual servoing module that reaches and grasps assembly parts in an unstructured multi-instance and dynamic environment, an action recognition module that performs human action prediction for implicit communication, and a visual handover module that uses the perceptual understanding of human behaviour to produce an intuitive and efficient collaborative assembly experience. GoferBot is a novel assembly system that seamlessly integrates all sub-modules by utilising implicit semantic information purely from visual perception.

CV May 12, 2025 Code
Asynchronous Multi-Object Tracking with an Event Camera

Angus Apps, Ziwei Wang, Vladimir Perejogin et al.

Event cameras are ideal sensors for enabling robots to detect and track objects in highly dynamic environments due to their low latency output, high temporal resolution, and high dynamic range. In this paper, we present the Asynchronous Event Multi-Object Tracking (AEMOT) algorithm for detecting and tracking multiple objects by processing individual raw events asynchronously. AEMOT detects salient event blob features by identifying regions of consistent optical flow using a novel Field of Active Flow Directions built from the Surface of Active Events. Detected features are tracked as candidate objects using the recently proposed Asynchronous Event Blob (AEB) tracker in order to construct small intensity patches of each candidate object. A novel learnt validation stage promotes or discards candidate objects based on classification of their intensity patches, with promoted objects having their position, velocity, size, and orientation estimated at their event rate. We evaluate AEMOT on a new Bee Swarm Dataset, where it tracks dozens of small bees with precision and recall performance exceeding that of alternative event-based detection and tracking algorithms by over 37%. Source code and the labelled event Bee Swarm Dataset will be open sourced.
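
The Surface of Active Events mentioned above is commonly described as a per-pixel map of the most recent event timestamp. The sketch below shows only that underlying data structure; the class name, the `(x, y, t)` event layout, and the neighbourhood query are assumptions for illustration, and the paper's Field of Active Flow Directions is not reproduced.

```python
class SurfaceOfActiveEvents:
    """Per-pixel map holding the timestamp of the most recent event."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # None marks pixels that have not fired yet.
        self.latest = [[None] * width for _ in range(height)]

    def update(self, x, y, t):
        """Process one event asynchronously: overwrite the pixel's timestamp."""
        self.latest[y][x] = t

    def recent_neighbours(self, x, y, t, window):
        """Return 8-neighbourhood pixels that fired within `window` seconds before t."""
        hits = []
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (dx, dy) == (0, 0):
                    continue
                if not (0 <= nx < self.width and 0 <= ny < self.height):
                    continue
                ts = self.latest[ny][nx]
                if ts is not None and 0.0 <= t - ts <= window:
                    hits.append((nx, ny))
        return hits

# Hypothetical events: (x, y, timestamp in seconds).
sae = SurfaceOfActiveEvents(width=4, height=4)
for x, y, t in [(1, 1, 0.001), (2, 1, 0.002), (3, 3, 0.0021), (1, 2, 0.5)]:
    sae.update(x, y, t)
near = sae.recent_neighbours(2, 2, t=0.003, window=0.005)
```

Each event costs a single write, which is what makes this kind of structure attractive for per-event processing.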

RO May 13
Galilean State Estimation for Inertial Navigation Systems with Unknown Time Delay

Giulio Delama, Martin Scheiber, Yixiao Ge et al.

Many Inertial Navigation Systems (INS) use Global Navigation Satellite System (GNSS) position as the primary measurement to drive filter performance and bound error growth. However, commercial-grade GNSS receivers introduce unknown measurement delays ranging from 50 ms to 300 ms depending on sensor quality and operating mode. Such time delays can significantly degrade INS performance unless they are explicitly compensated for. Existing algorithms commonly estimate this delay offline, run the filter concurrently with GNSS measurements using buffered Inertial Measurement Unit (IMU) data, and predict the current state by forward-integrating buffered inertial measurements via IMU preintegration. The state-of-the-art online method is an Extended Kalman Filter (EKF) that explicitly models the time delay as a state parameter, which defines the preintegration duration. This paper introduces a novel geometric framework for modeling time-delayed INS, in which Galilean symmetry is leveraged to provide a joint representation of space and time for consistent state estimation. An Equivariant Filter (EqF) is derived for the coupled estimation of navigation states and time delay. Validation is performed on two fixed-wing Uncrewed Aerial Vehicles (UAV) with GNSS time lags of 90 ms and 120 ms. The test flights last two to three minutes. Simulations further investigate delays up to 500 ms and provide a statistical comparison against the state-of-the-art EKF. Results show that the EqF preserves accuracy and consistency, while the EKF lacks consistency and its performance degrades significantly with increasing measurement delays.

SY May 11
Equivariant Observer Design on SL(3) for Image Intensity-Based Homography Estimation

Tarek Bouazza, Pieter van Goor, Robert Mahony et al.

This paper addresses the problem of homography estimation using a nonlinear observer designed on the Lie group $\mathbf{SL}(3)$ that exploits the full image information through direct image registration. Unlike traditional feature-based methods, which rely on extensive feature extraction and matching, the proposed approach formulates an observer that minimises a cost function defined directly in terms of image pixel intensities. Explicit conditions ensuring the non-degeneracy of the cost function are derived, and a comprehensive analysis is conducted to characterise and generate degenerate (unobservable) image configurations. Theoretical results demonstrate local exponential convergence of the observer. To improve local convergence properties, a second-order observer variant is introduced by incorporating the Hessian of the cost function into the correction term. Simulation results demonstrate the performance of the proposed solutions on real images.

RO May 22, 2020 Code
VDO-SLAM: A Visual Dynamic Object-aware SLAM System

Jun Zhang, Mina Henein, Robert Mahony et al.

Combining Simultaneous Localisation and Mapping (SLAM) estimation and dynamic scene modelling can highly benefit robot autonomy in dynamic environments. Robot path planning and obstacle avoidance tasks rely on accurate estimations of the motion of dynamic objects in the scene. This paper presents VDO-SLAM, a robust visual dynamic object-aware SLAM system that exploits semantic information to enable accurate motion estimation and tracking of dynamic rigid objects in the scene without any prior knowledge of the objects' shape or geometric models. The proposed approach identifies and tracks the dynamic objects and the static structure in the environment and integrates this information into a unified SLAM framework. This results in highly accurate estimates of the robot's trajectory and the full SE(3) motion of the objects as well as a spatiotemporal map of the environment. The system is able to extract linear velocity estimates from objects' SE(3) motion providing an important functionality for navigation in complex dynamic environments. We demonstrate the performance of the proposed system on a number of real indoor and outdoor datasets and the results show consistent and substantial improvements over the state-of-the-art algorithms. An open-source version of the source code is available.

CV Jun 26, 2024
Real-time Structure Flow

Juan David Adarve, Robert Mahony

This article introduces the structure flow field: a flow field that can provide high-speed robo-centric motion information for motion control of highly dynamic robotic devices and autonomous vehicles. Structure flow is the angular 3D velocity of the scene at a given pixel. We show that structure flow possesses an elegant evolution model in the form of a Partial Differential Equation (PDE) that enables us to create dense flow predictions forward in time. We exploit this structure to design a predictor-update algorithm to compute structure flow in real time using image and depth measurements. The prediction stage takes the previous estimate of the structure flow and propagates it forward in time using a numerical implementation of the structure flow PDE. The predicted flow is then updated using new image and depth data. The algorithm runs at up to 600 Hz on a desktop GPU machine for 512x512 images with flow values up to 8 pixels. We provide ground truth validation on high-speed synthetic image sequences as well as results on real-life video of driving scenarios.

ET Sep 4, 2023
High Frequency, High Accuracy Pointing onboard Nanosats using Neuromorphic Event Sensing and Piezoelectric Actuation

Yasir Latif, Peter Anastasiou, Yonhon Ng et al.

As satellites become smaller, the ability to maintain stable pointing decreases as external forces acting on the satellite come into play. At the same time, reaction wheels used in the attitude determination and control system (ADCS) introduce high frequency jitter which can disrupt pointing stability. For space domain awareness (SDA) tasks that track objects tens of thousands of kilometres away, the pointing accuracy offered by current nanosats, typically in the range of 10 to 100 arcseconds, is not sufficient. In this work, we develop a novel payload that utilises a neuromorphic event sensor (for high frequency and highly accurate relative attitude estimation) paired in a closed loop with a piezoelectric stage (for active attitude corrections) to provide highly stable sensor-specific pointing. Event sensors are especially suited for space applications due to their desirable characteristics of low power consumption, asynchronous operation, and high dynamic range. We use the event sensor to first estimate a reference background star field from which instantaneous relative attitude is estimated at high frequency. The piezoelectric stage works in a closed control loop with the event sensor to perform attitude corrections based on the discrepancy between the current and desired attitude. Results in a controlled setting show that we can achieve a pointing accuracy in the range of 1-5 arcseconds using our novel payload at an operating frequency of up to 50Hz using a prototype built from commercial-off-the-shelf components. Further details can be found at https://ylatif.github.io/ultrafinestabilisation

CV Sep 3, 2023
An Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras

Ziwei Wang, Yonhon Ng, Cedric Scheerlinck et al.

Event cameras are ideally suited to capture High Dynamic Range (HDR) visual information without blur but provide poor imaging capability for static or slowly varying scenes. Conversely, conventional image sensors measure absolute intensity of slowly changing scenes effectively but do poorly on HDR or quickly changing scenes. In this paper, we present an asynchronous linear filter architecture, fusing event and frame camera data, for HDR video reconstruction and spatial convolution that exploits the advantages of both sensor modalities. The key idea is the introduction of a state that directly encodes the integrated or convolved image information and that is updated asynchronously as each event or each frame arrives from the camera. The state can be read-off as-often-as and whenever required to feed into subsequent vision modules for real-time robotic systems. Our experimental results are evaluated on both publicly available datasets with challenging lighting conditions and fast motions, along with a new dataset with HDR reference that we provide. The proposed AKF pipeline outperforms other state-of-the-art methods in both absolute intensity error (69.4% reduction) and image similarity indexes (average 35.5% improvement). We also demonstrate the integration of image convolution with linear spatial kernels (Gaussian, Sobel, and Laplacian) as an application of our architecture.

RO Feb 4, 2022
Equivariant Filter Design for Inertial Navigation Systems with Input Measurement Biases

Alessandro Fornasier, Yonhon Ng, Robert Mahony et al.

Inertial Navigation Systems (INS) are a key technology for autonomous vehicle applications. Recent advances in estimation and filter design for the INS problem have exploited geometry and symmetry to overcome limitations of the classical Extended Kalman Filter (EKF) approach that formed the mainstay of INS systems since the mid-twentieth century. The industry standard INS filter, the Multiplicative Extended Kalman Filter (MEKF), uses a geometric construction for attitude estimation coupled with classical Euclidean construction for position, velocity and bias estimation. The recent Invariant Extended Kalman Filter (IEKF) provides a geometric framework for the full navigation states, integrating attitude, position and velocity, but still uses the classical Euclidean construction to model the bias states. In this paper, we use the recently proposed Equivariant Filter (EqF) framework to derive a novel observer for biased inertial-based navigation in a fully geometric framework. The introduction of virtual velocity inputs with associated virtual bias leads to a full equivariant symmetry on the augmented system. The resulting filter performance is evaluated with both simulated and real-world data, and demonstrates increased robustness to a wide range of erroneous initial conditions, and improved accuracy when compared with the industry standard Multiplicative EKF (MEKF) approach.

CV Oct 11, 2021
Stereo Hybrid Event-Frame (SHEF) Cameras for 3D Perception

Ziwei Wang, Liyuan Pan, Yonhon Ng et al.

Stereo camera systems play an important role in robotics applications to perceive the 3D world. However, conventional cameras have drawbacks such as low dynamic range, motion blur and latency due to the underlying frame-based mechanism. Event cameras address these limitations as they report the brightness changes of each pixel independently with a fine temporal resolution, but they are unable to acquire absolute intensity information directly. Although integrated hybrid event-frame sensors (e.g., DAVIS) are available, the quality of data is compromised by coupling at the pixel level in the circuit fabrication of such cameras. This paper proposes a stereo hybrid event-frame (SHEF) camera system that offers a sensor modality with separate high-quality pure event and pure frame cameras, overcoming the limitations of each separate sensor and allowing for stereo depth estimation. We provide a SHEF dataset targeted at evaluating disparity estimation algorithms and introduce a stereo disparity estimation algorithm that uses edge information extracted from the event stream correlated with the edge detected in the frame data. Our disparity estimation outperforms the state-of-the-art stereo matching algorithm on the SHEF dataset.

RO Apr 13, 2021
Inertial Collaborative Localisation for Autonomous Vehicles using a Minimum Energy Filter

Jack Henderson, Mohammad Zamani, Robert Mahony et al.

Collaborative Localisation has been studied extensively in recent years as a way to improve pose estimation of unmanned aerial vehicles in challenging environments. However, little attention has been paid toward advancing the underlying filter design beyond standard Extended Kalman Filter-based approaches. In this paper, we detail a discrete-time collaborative localisation filter using the deterministic minimum-energy framework. The filter incorporates measurements from an inertial measurement unit and models the effects of sensor bias and gravitational acceleration. We present a simulation based on real-world vehicle trajectories and IMU data that demonstrates how collaborative localisation can improve performance over single-vehicle methods.

RO Apr 8, 2021
An Equivariant Filter for Visual Inertial Odometry

Pieter van Goor, Robert Mahony

Visual Inertial Odometry (VIO) is of great interest due to the ubiquity of devices equipped with both a monocular camera and Inertial Measurement Unit (IMU). Methods based on the extended Kalman Filter remain popular in VIO due to their low memory requirements, CPU usage, and processing time when compared to optimisation-based methods. In this paper, we analyse the VIO problem from a geometric perspective and propose a novel formulation on a smooth quotient manifold where the equivalence relationship is the well-known invariance of VIO to choice of reference frame. We propose a novel Lie group that acts transitively on this manifold and is compatible with the visual measurements. This structure allows for the application of Equivariant Filter (EqF) design leading to a novel filter for the VIO problem. Combined with a very simple vision processing front-end, the proposed filter demonstrates state-of-the-art performance on the EuRoC dataset compared to other EKF-based VIO algorithms.

CV Jan 22, 2021
Iterative Optimisation with an Innovation CNN for Pose Refinement

Gerard Kennedy, Zheyu Zhuang, Xin Yu et al.

Object pose estimation from a single RGB image is a challenging problem due to variable lighting conditions and viewpoint changes. The most accurate pose estimation networks implement pose refinement via reprojection of a known, textured 3D model; however, such methods cannot be applied without high quality 3D models of the observed objects. In this work we propose an approach, namely an Innovation CNN, to object pose estimation refinement that overcomes the requirement for reprojecting a textured 3D model. Our approach improves initial pose estimation progressively by applying the Innovation CNN iteratively in a stochastic gradient descent (SGD) framework. We evaluate our method on the popular LINEMOD and Occlusion LINEMOD datasets and obtain state-of-the-art performance on both datasets.

CV Dec 17, 2020
Event Camera Calibration of Per-pixel Biased Contrast Threshold

Ziwei Wang, Yonhon Ng, Pieter van Goor et al.

Event cameras output asynchronous events to represent intensity changes with a high temporal resolution, even under extreme lighting conditions. Currently, most of the existing works use a single contrast threshold to estimate the intensity change of all pixels. However, complex circuit bias and manufacturing imperfections cause biased pixels and mismatched contrast thresholds among pixels, which may lead to undesirable outputs. In this paper, we propose a new event camera model and two calibration approaches which cover event-only cameras and hybrid image-event cameras. When intensity images are simultaneously provided along with events, we also propose an efficient online method to calibrate event cameras that adapts to time-varying event rates. We demonstrate the advantages of our proposed methods compared to the state-of-the-art on several different event camera datasets.
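
To make the threshold mismatch concrete, the sketch below uses the standard idealised event model, in which a pixel fires whenever its log intensity has moved by at least its contrast threshold since its last event. It is not the paper's camera model or calibration method; the function name `generate_events` and the threshold values are assumptions for illustration.

```python
def generate_events(log_intensity, threshold):
    """Idealised single-pixel event generation.

    Emits (sample_index, polarity) each time the log intensity has moved by
    at least `threshold` from the reference level set at the previous event.
    """
    events = []
    reference = log_intensity[0]
    for n, value in enumerate(log_intensity[1:], start=1):
        # A large change can trigger several events at the same sample.
        while value - reference >= threshold:
            reference += threshold
            events.append((n, +1))
        while reference - value >= threshold:
            reference -= threshold
            events.append((n, -1))
    return events

# The same brightness ramp seen by two pixels with different thresholds.
ramp = [0.0, 0.25, 0.5, 0.75, 1.0]
events_nominal = generate_events(ramp, threshold=0.25)
events_biased = generate_events(ramp, threshold=0.5)

# Reconstructing both pixels with one shared threshold (0.25) misjudges
# the biased pixel's intensity change.
shared = 0.25
change_nominal = shared * sum(p for _, p in events_nominal)
change_biased = shared * sum(p for _, p in events_biased)
```

Here both pixels observe the same change of 1.0 in log intensity, but integrating events with a single shared threshold recovers only half of it for the second pixel, which is the kind of error per-pixel calibration is meant to address.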

CV Dec 10, 2020
An Asynchronous Kalman Filter for Hybrid Event Cameras

Ziwei Wang, Yonhon Ng, Cedric Scheerlinck et al.

Event cameras are ideally suited to capture HDR visual information without blur but perform poorly on static or slowly changing scenes. Conversely, conventional image sensors measure absolute intensity of slowly changing scenes effectively but do poorly on high dynamic range or quickly changing scenes. In this paper, we present an event-based video reconstruction pipeline for High Dynamic Range (HDR) scenarios. The proposed algorithm includes a frame augmentation pre-processing step that deblurs and temporally interpolates frame data using events. The augmented frame and event data are then fused using a novel asynchronous Kalman filter under a unifying uncertainty model for both sensors. Our experimental results are evaluated on both publicly available datasets with challenging lighting conditions and fast motions and our new dataset with HDR reference. The proposed algorithm outperforms state-of-the-art methods in both absolute intensity error (48% reduction) and image similarity indexes (average 11% improvement).

SY Sep 10, 2020
A Minimum Energy Filter for Localisation of an Unmanned Aerial Vehicle

Jack Henderson, Mohammad Zamani, Robert Mahony et al.

Accurate localisation of unmanned aerial vehicles is vital for the next generation of automation tasks. This paper proposes a minimum energy filter for velocity-aided pose estimation on the extended special Euclidean group. The approach taken exploits the Lie-group symmetry of the problem to combine Inertial Measurement Unit (IMU) sensor output with landmark measurements into a robust and high performance state estimate. We propose an asynchronous discrete-time implementation to fuse high bandwidth IMU with low bandwidth discrete-time landmark measurements typical of real-world scenarios. The filter's performance is demonstrated by simulation.

CV Aug 6, 2020
Shonan Rotation Averaging: Global Optimality by Surfing $SO(p)^n$

Frank Dellaert, David M. Rosen, Jing Wu et al.

Shonan Rotation Averaging is a fast, simple, and elegant rotation averaging algorithm that is guaranteed to recover globally optimal solutions under mild assumptions on the measurement noise. Our method employs semidefinite relaxation in order to recover provably globally optimal solutions of the rotation averaging problem. In contrast to prior work, we show how to solve large-scale instances of these relaxations using manifold minimization on (only slightly) higher-dimensional rotation manifolds, re-using existing high-performance (but local) structure-from-motion pipelines. Our method thus preserves the speed and scalability of current SFM methods, while recovering globally optimal solutions.

RO Jul 28, 2020
Robust Ego and Object 6-DoF Motion Estimation and Tracking

Jun Zhang, Mina Henein, Robert Mahony et al.

The problem of tracking self-motion as well as motion of objects in the scene using information from a camera is known as multi-body visual odometry and is a challenging task. This paper proposes a robust solution to achieve accurate estimation and consistent track-ability for dynamic multi-body visual odometry. A compact and effective framework is proposed leveraging recent advances in semantic instance-level segmentation and accurate optical flow estimation. A novel formulation, jointly optimizing SE(3) motion and optical flow, is introduced that improves the quality of the tracked points and the motion estimation accuracy. The proposed approach is evaluated on the virtual KITTI Dataset and tested on the real KITTI Dataset, demonstrating its applicability to autonomous driving applications. For the benefit of the community, we make the source code public.

RO May 29, 2020
An Observer Design for Visual Simultaneous Localisation and Mapping with Output Equivariance

Pieter van Goor, Robert Mahony, Tarek Hamel et al.

Visual Simultaneous Localisation and Mapping (VSLAM) is a key enabling technology for small embedded robotic systems such as aerial vehicles. Recent advances in equivariant filter and observer design offer the potential of a new generation of highly robust algorithms with low memory and computation requirements for embedded system applications. This paper studies observer design on the symmetry group proposed in previous work by the authors, in the case where inverse depth measurements are available. Exploiting this symmetry leads to a simple fully non-linear gradient based observer with almost global asymptotic and local exponential stability properties. Simulation experiments verify the observer design, and demonstrate that the proposed observer achieves similar accuracy to the widely used Extended Kalman Filter with significant gains in processing time (linear versus quadratic bounds with respect to number of landmarks) and qualitative improvements in robustness.

RO May 25, 2020
LyRN (Lyapunov Reaching Network): A Real-Time Closed Loop approach from Monocular Vision

Zheyu Zhuang, Xin Yu, Robert Mahony

We propose a closed-loop, multi-instance control algorithm for visually guided reaching based on novel learning principles. A control Lyapunov function methodology is used to design a reaching action for a complex multi-instance task in the case where full state information (poses of all potential reaching points) is available. The proposed algorithm uses monocular vision and manipulator joint angles as the input to a deep convolutional neural network to predict the value of the control Lyapunov function (cLf) and corresponding velocity control. The resulting network output is used in real-time as visual control for the grasping task with the multi-instance capability emerging naturally from the design of the control Lyapunov function. We demonstrate the proposed algorithm grasping mugs (textureless and symmetric objects) on a table-top from an over-the-shoulder monocular RGB camera. The manipulator dynamically converges to the best-suited target among multiple identical instances from any random initial pose within the workspace. The system trained with only simulated data is able to achieve 90.3% grasp success rate in the real-world experiments with up to 85Hz closed-loop control on one GTX 1080Ti GPU and significantly outperforms a Pose-Based-Visual-Servo (PBVS) grasping system adapted from a state-of-the-art single shot RGB 6D pose estimation algorithm. A key contribution of the paper is the inclusion of a first-order differential constraint associated with the cLf as a regularisation term during learning, and we provide evidence that this leads to more robust and reliable reaching/grasping performance than vanilla regression on general control inputs.
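
The full-state case described above can be illustrated with a toy example. The abstract does not give the form of the cLf, so the sketch below assumes one simple candidate, V(x) = min_i 0.5 * ||x - t_i||^2 over known target positions t_i, with velocity control u = -gain * grad V; treating the nearest target as the "best-suited" one is also an assumption. No learning or vision is involved here.

```python
def clf_value_and_control(position, targets, gain=1.0):
    """Return (V, u) for V(x) = min_i 0.5 * ||x - t_i||^2 and u = -gain * grad V."""
    best = min(targets, key=lambda t: sum((p - q) ** 2 for p, q in zip(position, t)))
    error = [p - q for p, q in zip(position, best)]
    value = 0.5 * sum(e * e for e in error)
    control = [-gain * e for e in error]
    return value, control

def reach(position, targets, gain=1.0, dt=0.05, steps=400):
    """Integrate x' = u with forward Euler and record V along the way."""
    values = []
    for _ in range(steps):
        value, control = clf_value_and_control(position, targets, gain)
        values.append(value)
        position = [p + dt * u for p, u in zip(position, control)]
    return position, values

# Hypothetical planar targets; the start point is closest to (1.0, 0.0).
targets = [(1.0, 0.0), (-2.0, 0.5), (0.0, 3.0)]
final, values = reach([0.4, 0.3], targets)
```

Because V is the minimum over per-target costs, the choice of target falls out of the function itself rather than from a separate selection step, and V decreases along the closed-loop trajectory in this example.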

CV Mar 20, 2020
Reducing the Sim-to-Real Gap for Event Cameras

Timo Stoffregen, Cedric Scheerlinck, Davide Scaramuzza et al.

Event cameras are paradigm-shifting novel sensors that report asynchronous, per-pixel brightness changes called 'events' with unparalleled low latency. This makes them ideal for high speed, high dynamic range scenes where conventional cameras would fail. Recent work has demonstrated impressive results using Convolutional Neural Networks (CNNs) for video reconstruction and optic flow with events. We present strategies for improving training data for event-based CNNs that result in a 20-40% boost in performance of existing state-of-the-art (SOTA) video reconstruction networks retrained with our method, and up to 15% for optic flow networks. A challenge in evaluating event-based video reconstruction is lack of quality ground truth images in existing datasets. To address this, we present a new High Quality Frames (HQF) dataset, containing events and ground truth frames from a DAVIS240C that are well-exposed and minimally motion-blurred. We evaluate our method on HQF and several existing major event camera datasets.

RO Feb 20, 2020
Dynamic SLAM: The Need For Speed

Mina Henein, Jun Zhang, Robert Mahony et al.

The static world assumption is standard in most simultaneous localisation and mapping (SLAM) algorithms. Increased deployment of autonomous systems to unstructured dynamic environments is driving a need to identify moving objects and estimate their velocity in real-time. Most existing SLAM based approaches rely on a database of 3D models of objects or impose significant motion constraints. In this paper, we propose a new feature-based, model-free, object-aware dynamic SLAM algorithm that exploits semantic segmentation to allow estimation of motion of rigid objects in a scene without the need to estimate the object poses or have any prior knowledge of their 3D models. The algorithm generates a map of dynamic and static structure and has the ability to extract velocities of rigid moving objects in the scene. Its performance is demonstrated on simulated, synthetic and real-world datasets.

CV Apr 24, 2019
CED: Color Event Camera Dataset

Cedric Scheerlinck, Henri Rebecq, Timo Stoffregen et al.

Event cameras are novel, bio-inspired visual sensors, whose pixels output asynchronous and independent timestamped spikes at local intensity changes, called 'events'. Event cameras offer advantages over conventional frame-based cameras in terms of latency, high dynamic range (HDR) and temporal resolution. Until recently, event cameras have been limited to outputting events in the intensity channel; however, recent advances have resulted in the development of color event cameras, such as the Color-DAVIS346. In this work, we present and release the first Color Event Camera Dataset (CED), containing 50 minutes of footage with both color frames and events. CED features a wide variety of indoor and outdoor scenes, which we hope will help drive forward event-based vision research. We also present an extension of the event camera simulator ESIM that enables simulation of color events. Finally, we present an evaluation of three state-of-the-art image reconstruction methods that can be used to convert the Color-DAVIS346 into a continuous-time, HDR, color video camera to visualise the event stream, and for use in downstream vision applications.

ROApr 4, 2019
An Equivariant Observer Design for Visual Localisation and Mapping

Pieter van Goor, Robert Mahony, Tarek Hamel et al.

This paper builds on recent work on Simultaneous Localisation and Mapping (SLAM) in the non-linear observer community, by framing the visual localisation and mapping problem as a continuous-time equivariant observer design problem on the symmetry group of a kinematic system. The state-space is a quotient of the robot pose expressed on SE(3) and multiple copies of real projective space, used to represent both points in space and bearings in a single unified framework. An observer with decoupled Riccati-gains for each landmark is derived and we show that its error system is almost globally asymptotically stable and exponentially stable in-the-large.

CVDec 2, 2018
Asynchronous Spatial Image Convolutions for Event Cameras

Cedric Scheerlinck, Nick Barnes, Robert Mahony

Spatial convolution is arguably the most fundamental of 2D image processing operations. Conventional spatial image convolution can only be applied to a conventional image, that is, an array of pixel values (or similar image representation) that are associated with a single instant in time. Event cameras have serial, asynchronous output with no natural notion of an image frame, and each event arrives with a different timestamp. In this paper, we propose a method to compute the convolution of a linear spatial kernel with the output of an event camera. The approach operates on the event stream output of the camera directly without synthesising pseudo-image frames as is common in the literature. The key idea is the introduction of an internal state that directly encodes the convolved image information, which is updated asynchronously as each event arrives from the camera. The state can be read off as often as, and whenever, required for use in higher-level vision algorithms for real-time robotic systems. We demonstrate the application of our method to corner detection, providing an implementation of a Harris corner-response "state" that can be used in real time for feature detection and tracking on robotic systems.
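The internal-state idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each event changes the (log) intensity at one pixel by a fixed contrast step, ignores timestamps and any temporal filtering, and uses the hypothetical names `make_state` and `update_state`. Because convolution is linear, each event can be folded into the state by adding a scaled copy of the kernel centred on the event pixel, so no image frame is ever built.

```python
import numpy as np


def make_state(height, width):
    """Return a zero-initialised convolved-image state."""
    return np.zeros((height, width), dtype=float)


def update_state(state, kernel, x, y, polarity, contrast=0.1):
    """Fold one event at pixel (x, y) into the state, in place.

    Assumptions (not from the abstract): the kernel has odd height and
    width, polarity is +1 or -1, and each event is a fixed intensity
    step of size `contrast`. Contributions outside the image are dropped.
    """
    kh, kw = kernel.shape
    ry, rx = kh // 2, kw // 2
    h, w = state.shape
    step = polarity * contrast
    for dy in range(-ry, ry + 1):
        for dx in range(-rx, rx + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                # An impulse at (y, x) contributes kernel(d) at (y, x) + d.
                state[yy, xx] += step * kernel[dy + ry, dx + rx]
    return state


# Usage: a vertical-gradient kernel applied to a short, made-up event stream.
kernel = np.array([[1.0, 2.0, 1.0],
                   [0.0, 0.0, 0.0],
                   [-1.0, -2.0, -1.0]])
state = make_state(5, 6)
for x, y, polarity in [(2, 3, +1), (0, 0, -1), (5, 4, +1)]:
    update_state(state, kernel, x, y, polarity)
```

The per-event cost depends only on the kernel size, not on the image size, which is what makes reading the state at arbitrary times cheap in this sketch.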

CVNov 1, 2018
Continuous-time Intensity Estimation Using Event Cameras

Cedric Scheerlinck, Nick Barnes, Robert Mahony

Event cameras provide asynchronous, data-driven measurements of local temporal contrast over a large dynamic range with extremely high temporal resolution. Conventional cameras capture low-frequency reference intensity information. These two sensor modalities provide complementary information. We propose a computationally efficient, asynchronous filter that continuously fuses image frames and events into a single high-temporal-resolution, high-dynamic-range image state. In the absence of conventional image frames, the filter can be run on events only. We present experimental results on high-speed, high-dynamic-range sequences, as well as on new ground truth datasets that we generate, to demonstrate that the proposed algorithm outperforms existing state-of-the-art methods.
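One way to picture such a fusion is a per-pixel complementary filter, sketched below. The abstract does not spell out the filter equations, so everything here is an assumption made for illustration: the class name `PixelFusionFilter`, the gain `alpha`, the fixed `contrast` step per event, and the exponential relaxation towards the latest frame value. Events supply the fast changes and the frame supplies the slowly varying reference.

```python
import math


class PixelFusionFilter:
    """Hypothetical single-pixel filter fusing events with frame values.

    Between updates the state relaxes exponentially towards the latest
    frame log-intensity at rate `alpha`; each event adds a step of
    `polarity * contrast`. With alpha = 0 the frame term has no effect
    and the filter simply integrates events.
    """

    def __init__(self, alpha, contrast, log_frame=0.0, t0=0.0):
        self.alpha = alpha
        self.contrast = contrast
        self.log_frame = log_frame
        self.state = log_frame
        self.t = t0

    def _predict(self, t):
        # Closed-form solution of dL/dt = -alpha * (L - log_frame).
        w = math.exp(-self.alpha * (t - self.t))
        return w * self.state + (1.0 - w) * self.log_frame

    def read(self, t):
        """Return the estimate at time t without changing the filter."""
        return self._predict(t)

    def on_event(self, t, polarity):
        """Apply one event with polarity +1 or -1 at time t."""
        self.state = self._predict(t) + polarity * self.contrast
        self.t = t

    def on_frame(self, t, log_frame):
        """Register a new frame log-intensity for this pixel at time t."""
        self.state = self._predict(t)
        self.t = t
        self.log_frame = log_frame


# Usage with made-up values: two events, then a new frame measurement.
pixel = PixelFusionFilter(alpha=2.0, contrast=0.2, log_frame=1.0)
pixel.on_event(0.01, +1)
pixel.on_event(0.02, +1)
pixel.on_frame(0.03, 1.3)
estimate = pixel.read(0.04)
```

Updates happen only when an event or frame arrives for that pixel, so the work scales with the data rate rather than with a fixed frame clock.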

ROMay 10, 2018
Simultaneous Localization and Mapping with Dynamic Rigid Objects

Mina Henein, Gerard Kennedy, Viorela Ila et al.

Accurate estimation of the environment structure simultaneously with the robot pose is a key capability of autonomous robotic vehicles. Classical simultaneous localization and mapping (SLAM) algorithms rely on the static world assumption to formulate the estimation problem; however, the real world has a significant amount of dynamics that can be exploited for more accurate localization and a more versatile representation of the environment. In this paper, we propose a technique to integrate the motion of dynamic objects into the SLAM estimation problem, without the necessity of estimating the pose or the geometry of the objects. To this end, we introduce a novel representation of the pose change of rigid bodies in motion and show the benefits of integrating such information when performing SLAM in dynamic environments. Our experiments show consistent improvement in robot localization and mapping accuracy when using a simple constant motion assumption, even for objects whose motion slightly violates this assumption.

ROApr 4, 2017
A Discrete-Time Attitude Observer on SO(3) for Vision and GPS Fusion

Alireza Khosravian, Tat-Jun Chin, Ian Reid et al.

This paper proposes a discrete-time geometric attitude observer for fusing monocular vision with GPS velocity measurements. The observer takes the relative transformations obtained from processing monocular images with any visual odometry algorithm and fuses them with GPS velocity measurements. The objectives of this sensor fusion are twofold: first, to mitigate the inherent drift of the attitude estimates of the visual odometry, and second, to estimate the orientation directly with respect to the North-East-Down frame. A key contribution of the paper is to present a rigorous stability analysis showing that the attitude estimates of the observer converge exponentially to the true attitude and to provide a lower bound for the convergence rate of the observer. Through experimental studies, we demonstrate that the observer effectively compensates for the inherent drift of the pure monocular vision based attitude estimation and is able to recover the North-East-Down orientation even if it is initialized with a very large attitude error.

ROJan 16, 2017
3D tracking of water hazards with polarized stereo cameras

Chuong V. Nguyen, Michael Milford, Robert Mahony

Current self-driving car systems operate well in sunny weather but struggle in adverse conditions. One of the most commonly encountered adverse conditions involves water on the road caused by rain, sleet, melting snow or flooding. While some advances have been made in using conventional RGB camera and LIDAR technology for detecting water hazards, other sources of information such as polarization offer a promising and potentially superior approach to this problem in terms of performance and cost. In this paper, we present a novel stereo-polarization system for detecting and tracking water hazards based on polarization and color variation of reflected light, with consideration of the effect of polarized light from the sky as a function of reflection and azimuth angles. To evaluate this system, we present a new, large 'water on road' dataset spanning approximately 2 km of driving in various on-road and off-road conditions and demonstrate, for the first time, reliable water detection and tracking over a wide range of realistic car driving water conditions using polarized vision as the primary sensing modality. Our system successfully detects water hazards at distances of more than 100 m. Finally, we discuss several interesting challenges and propose future research directions for further improving robust autonomous car perception in hazardous wet conditions using polarization sensors.

CVJun 9, 2016
Feature-based Recursive Observer Design for Homography Estimation

Minh-Duc Hua, Jochen Trumpf, Tarek Hamel et al.

This paper presents a new algorithm for online estimation of a sequence of homographies applicable to image sequences obtained from robotic vehicles equipped with vision sensors. The approach taken exploits the underlying Special Linear group structure of the set of homographies, along with gyroscope measurements and direct point-feature correspondences between images, to develop a temporal filter for the homography estimate. Theoretical analysis and experimental results are provided to demonstrate the robustness of the proposed algorithm. The experimental results show excellent performance even in the case of very fast camera motion (relative to frame rate), severe occlusion, and in the presence of specular reflections.