ROSep 9, 2021
Keeping an Eye on Things: Deep Learned Features for Long-Term Visual LocalizationMona Gridseth, Timothy D. Barfoot
In this paper, we learn visual features that we use to first build a map and then localize a robot driving autonomously across a full day of lighting change, including in the dark. We train a neural network to predict sparse keypoints with associated descriptors and scores that can be used together with a classical pose estimator for localization. Our training pipeline includes a differentiable pose estimator such that training can be supervised with ground truth poses from data collected earlier, in our case from 2016 and 2017 gathered with multi-experience Visual Teach and Repeat (VT&R). We insert the learned features into the existing VT&R pipeline to perform closed-loop path following in unstructured outdoor environments. We show successful path following across all lighting conditions despite the robot's map being constructed using daylight conditions. Moreover, we explore generalizability of the features by driving the robot across all lighting conditions in new areas not present in the feature training dataset. In all, we validated our approach with 35.5 km of autonomous path following experiments in challenging conditions.
ROFeb 22, 2021
Unsupervised Learning of Lidar Features for Use in a Probabilistic Trajectory EstimatorDavid J. Yoon, Haowei Zhang, Mona Gridseth et al.
We present unsupervised parameter learning in a Gaussian variational inference setting that combines classic trajectory estimation for mobile robots with deep learning for rich sensor data, all under a single learning objective. The framework is an extension of an existing system identification method that optimizes for the observed data likelihood, which we improve with modern advances in batch trajectory estimation and deep learning. Though the framework is general to any form of parameter learning and sensor modality, we demonstrate application to feature and uncertainty learning with a deep network for 3D lidar odometry. Our framework learns from only the on-board lidar data, and does not require any form of groundtruth supervision. We demonstrate that our lidar odometry performs better than existing methods that learn the full estimator with a deep network, and comparable to state-of-the-art ICP-based methods on the KITTI odometry dataset. We additionally show results on lidar data from the Oxford RobotCar dataset.
RODec 10, 2020
Self-Supervised Learning of Lidar Segmentation for Autonomous Indoor NavigationHugues Thomas, Ben Agro, Mona Gridseth et al.
We present a self-supervised learning approach for the semantic segmentation of lidar frames. Our method is used to train a deep point cloud segmentation architecture without any human annotation. The annotation process is automated with the combination of simultaneous localization and mapping (SLAM) and ray-tracing algorithms. By performing multiple navigation sessions in the same environment, we are able to identify permanent structures, such as walls, and disentangle short-term and long-term movable objects, such as people and tables, respectively. New sessions can then be performed using a network trained to predict these semantic labels. We demonstrate the ability of our approach to improve itself over time, from one session to the next. With semantically filtered point clouds, our robot can navigate through more complex scenarios, which, when added to the training pool, help to improve our network predictions. We provide insights into our network predictions and show that our approach can also improve the performances of common localization techniques.
ROMar 5, 2020
DeepMEL: Compiling Visual Multi-Experience Localization into a Deep Neural NetworkMona Gridseth, Timothy D. Barfoot
Vision-based path following allows robots to autonomously repeat manually taught paths. Stereo Visual Teach and Repeat (VT\&R) accomplishes accurate and robust long-range path following in unstructured outdoor environments across changing lighting, weather, and seasons by relying on colour-constant imaging and multi-experience localization. We leverage multi-experience VT\&R together with two datasets of outdoor driving on two separate paths spanning different times of day, weather, and seasons to teach a deep neural network to predict relative pose for visual odometry (VO) and for localization with respect to a path. In this paper we run experiments exclusively on datasets to study how the network generalizes across environmental conditions. Based on the results we believe that our system achieves relative pose estimates sufficiently accurate for in-the-loop path following and that it is able to localize radically different conditions against each other directly (i.e. winter to spring and day to night), a capability that our hand-engineered system does not have.
CVApr 1, 2019
Learning Matchable Image Transformations for Long-term Metric Visual LocalizationLee Clement, Mona Gridseth, Justin Tomasi et al.
Long-term metric self-localization is an essential capability of autonomous mobile robots, but remains challenging for vision-based systems due to appearance changes caused by lighting, weather, or seasonal variations. While experience-based mapping has proven to be an effective technique for bridging the `appearance gap,' the number of experiences required for reliable metric localization over days or months can be very large, and methods for reducing the necessary number of experiences are needed for this approach to scale. Taking inspiration from color constancy theory, we learn a nonlinear RGB-to-grayscale mapping that explicitly maximizes the number of inlier feature matches for images captured under different lighting and weather conditions, and use it as a pre-processing step in a conventional single-experience localization pipeline to improve its robustness to appearance change. We train this mapping by approximating the target non-differentiable localization pipeline with a deep neural network, and find that incorporating a learned low-dimensional context feature can further improve cross-appearance feature matching. Using synthetic and real-world datasets, we demonstrate substantial improvements in localization performance across day-night cycles, enabling continuous metric localization over a 30-hour period using a single mapping experience, and allowing experience-based localization to scale to long deployments with dramatically reduced data requirements.
RONov 3, 2018
Building a Winning Self-Driving Car in Six MonthsKeenan Burnett, Andreas Schimpe, Sepehr Samavi et al.
The SAE AutoDrive Challenge is a three-year competition to develop a Level 4 autonomous vehicle by 2020. The first set of challenges were held in April of 2018 in Yuma, Arizona. Our team (aUToronto/Zeus) placed first. In this paper, we describe our complete system architecture and specialized algorithms that enabled us to win. We show that it is possible to develop a vehicle with basic autonomy features in just six months relying on simple, robust algorithms. We do not make use of a prior map. Instead, we have developed a multi-sensor visual localization solution. All of our algorithms run in real-time using CPUs only. We also highlight the closed-loop performance of our system in detail in several experiments.