ROSep 14, 2021
Large-scale Autonomous Flight with Real-time Semantic SLAM under Dense Forest CanopyXu Liu, Guilherme V. Nardari, Fernando Cladera Ojeda et al.
Semantic maps represent the environment using a set of semantically meaningful objects. This representation is storage-efficient, less ambiguous, and more informative, thus facilitating large-scale autonomy and the acquisition of actionable information in highly unstructured, GPS-denied environments. In this letter, we propose an integrated system that can perform large-scale autonomous flights and real-time semantic mapping in challenging under-canopy environments. We detect and model tree trunks and ground planes from LiDAR data, which are associated across scans and used to constrain robot poses as well as tree trunk models. The autonomous navigation module utilizes a multi-level planning and mapping framework and computes dynamically feasible trajectories that lead the UAV to build a semantic map of the user-defined region of interest in a computationally and storage efficient manner. A drift-compensation mechanism is designed to minimize the odometry drift using semantic SLAM outputs in real time, while maintaining planner optimality and controller stability. This leads the UAV to execute its mission accurately and safely at scale.
CVSep 23, 2020
Place Recognition in Forests with Urquhart TessellationsGuilherme V. Nardari, Avraham Cohen, Steven W. Chen et al.
In this letter, we present a novel descriptor based on Urquhart tessellations derived from the position of trees in a forest. We propose a framework that uses these descriptors to detect previously seen observations and landmark correspondences, even with partial overlap and noise. We run loop closure detection experiments in simulation and real-world data map-merging from different flights of an Unmanned Aerial Vehicle (UAV) in a pine tree forest and show that our method outperforms state-of-the-art approaches in accuracy and robustness.
RODec 29, 2019
SLOAM: Semantic Lidar Odometry and Mapping for Forest InventorySteven W. Chen, Guilherme V. Nardari, Elijah S. Lee et al.
This paper describes an end-to-end pipeline for tree diameter estimation based on semantic segmentation and lidar odometry and mapping. Accurate mapping of this type of environment is challenging since the ground and the trees are surrounded by leaves, thorns and vines, and the sensor typically experiences extreme motion. We propose a semantic feature based pose optimization that simultaneously refines the tree models while estimating the robot pose. The pipeline utilizes a custom virtual reality tool for labeling 3D scans that is used to train a semantic segmentation network. The masked point cloud is used to compute a trellis graph that identifies individual instances and extracts relevant features that are used by the SLAM module. We show that traditional lidar and image based methods fail in the forest environment on both Unmanned Aerial Vehicle (UAV) and hand-carry systems, while our method is more robust, scalable, and automatically generates tree diameter estimations.
LGOct 23, 2019
Large Scale Model Predictive Control with Neural Networks and Primal Active SetsSteven W. Chen, Tianyu Wang, Nikolay Atanasov et al.
This work presents an explicit-implicit procedure to compute a model predictive control (MPC) law with guarantees on recursive feasibility and asymptotic stability. The approach combines an offline-trained fully-connected neural network with an online primal active set solver. The neural network provides a control input initialization while the primal active set method ensures recursive feasibility and asymptotic stability. The neural network is trained with a primal-dual loss function, aiming to generate control sequences that are primal feasible and meet a desired level of suboptimality. Since the neural network alone does not guarantee constraint satisfaction, its output is used to warm start the primal active set method online. We demonstrate that this approach scales to large problems with thousands of optimization variables, which are challenging for current approaches. Our method achieves a 2x reduction in online inference time compared to the best method in a benchmark suite of different solver and initialization strategies.
CVFeb 2, 2019
DFuseNet: Deep Fusion of RGB and Sparse Depth Information for Image Guided Dense Depth CompletionShreyas S. Shivakumar, Ty Nguyen, Ian D. Miller et al.
In this paper we propose a convolutional neural network that is designed to upsample a series of sparse range measurements based on the contextual cues gleaned from a high resolution intensity image. Our approach draws inspiration from related work on super-resolution and in-painting. We propose a novel architecture that seeks to pull contextual cues separately from the intensity image and the depth features and then fuse them later in the network. We argue that this approach effectively exploits the relationship between the two modalities and produces accurate results while respecting salient image structures. We present experimental results to demonstrate that our approach is comparable with state of the art methods and generalizes well across multiple datasets.
ROJan 24, 2019
Decentralization of Multiagent Policies by Learning What to CommunicateJames Paulos, Steven W. Chen, Daigo Shishika et al.
Effective communication is required for teams of robots to solve sophisticated collaborative tasks. In practice it is typical for both the encoding and semantics of communication to be manually defined by an expert; this is true regardless of whether the behaviors themselves are bespoke, optimization based, or learned. We present an agent architecture and training methodology using neural networks to learn task-oriented communication semantics based on the example of a communication-unaware expert policy. A perimeter defense game illustrates the system's ability to handle dynamically changing numbers of agents and its graceful degradation in performance as communication constraints are tightened or the expert's observability assumptions are broken.
RONov 4, 2018
Monocular Camera Based Fruit Counting and Mapping with Semantic Data AssociationXu Liu, Steven W. Chen, Chenhao Liu et al.
We present a cheap, lightweight, and fast fruit counting pipeline that uses a single monocular camera. Our pipeline that relies only on a monocular camera, achieves counting performance comparable to state-of-the-art fruit counting system that utilizes an expensive sensor suite including LiDAR and GPS/INS on a mango dataset. Our monocular camera pipeline begins with a fruit detection component that uses a deep neural network. It then uses semantic structure from motion (SFM) to convert these detections into fruit counts by estimating landmark locations of the fruit in 3D, and using these landmarks to identify double counting scenarios. There are many benefits of developing a low cost and lightweight fruit counting system, including applicability to agriculture in developing countries, where monetary constraints or unstructured environments necessitate cheaper hardware solutions.
CVApr 1, 2018
Robust Fruit Counting: Combining Deep Learning, Tracking, and Structure from MotionXu Liu, Steven W. Chen, Shreyas Aditya et al.
We present a novel fruit counting pipeline that combines deep segmentation, frame to frame tracking, and 3D localization to accurately count visible fruits across a sequence of images. Our pipeline works on image streams from a monocular camera, both in natural light, as well as with controlled illumination at night. We first train a Fully Convolutional Network (FCN) and segment video frame images into fruit and non-fruit pixels. We then track fruits across frames using the Hungarian Algorithm where the objective cost is determined from a Kalman Filter corrected Kanade-Lucas-Tomasi (KLT) Tracker. In order to correct the estimated count from tracking process, we combine tracking results with a Structure from Motion (SfM) algorithm to calculate relative 3D locations and size estimates to reject outliers and double counted fruit tracks. We evaluate our algorithm by comparing with ground-truth human-annotated visual counts. Our results demonstrate that our pipeline is able to accurately and reliably count fruits across image sequences, and the correction step can significantly improve the counting accuracy and robustness. Although discussed in the context of fruit counting, our work can extend to detection, tracking, and counting of a variety of other stationary features of interest such as leaf-spots, wilt, and blossom.
CVSep 12, 2017
Unsupervised Deep Homography: A Fast and Robust Homography Estimation ModelTy Nguyen, Steven W. Chen, Shreyas S. Shivakumar et al.
Homography estimation between multiple aerial images can provide relative pose estimation for collaborative autonomous exploration and monitoring. The usage on a robotic system requires a fast and robust homography estimation algorithm. In this study, we propose an unsupervised learning algorithm that trains a Deep Convolutional Neural Network to estimate planar homographies. We compare the proposed algorithm to traditional feature-based and direct methods, as well as a corresponding supervised learning algorithm. Our empirical results demonstrate that compared to traditional approaches, the unsupervised algorithm achieves faster inference speed, while maintaining comparable or better accuracy and robustness to illumination variation. In addition, on both a synthetic dataset and representative real-world aerial dataset, our unsupervised method has superior adaptability and performance compared to the supervised deep learning method.