CV Nov 24, 2023
GPT-4V Takes the Wheel: Promises and Challenges for Pedestrian Behavior Prediction
Jia Huang, Peng Jiang, Alvika Gautam et al.
Predicting pedestrian behavior is key to ensuring the safety and reliability of autonomous vehicles. While deep learning methods have shown promise by learning from annotated video frame sequences, they often fail to fully grasp the dynamic interactions between pedestrians and traffic, which are crucial for accurate predictions. These models also lack nuanced common sense reasoning. Moreover, the manual annotation of datasets for these models is expensive, and the models are challenging to adapt to new situations. The advent of Vision Language Models (VLMs) introduces promising alternatives to these issues, thanks to their advanced visual and causal reasoning skills. To our knowledge, this research is the first to conduct both quantitative and qualitative evaluations of VLMs in the context of pedestrian behavior prediction for autonomous driving. We evaluate GPT-4V(ision) on publicly available pedestrian datasets: JAAD and WiDEVIEW. Our quantitative analysis focuses on GPT-4V's ability to predict pedestrian behavior in current and future frames. The model achieves 57% accuracy in a zero-shot manner, which, while impressive, is still behind the state-of-the-art domain-specific models (70%) in predicting pedestrian crossing actions. Qualitatively, GPT-4V shows an impressive ability to process and interpret complex traffic scenarios, differentiate between various pedestrian behaviors, and detect and analyze groups. However, it faces challenges, such as difficulty in detecting smaller pedestrians and assessing the relative motion between pedestrians and the ego vehicle.
CV Jun 24, 2022
Contrastive Learning of Features between Images and LiDAR
Peng Jiang, Srikanth Saripalli
Images and point clouds provide different information for robots. Finding the correspondences between data from different sensors is crucial for various tasks such as localization, mapping, and navigation. While learning-based descriptors have been developed for single sensors, there is little work on cross-modal features. This work treats learning cross-modal features as a dense contrastive learning problem. We propose a Tuple-Circle loss function for cross-modality feature learning. Furthermore, to learn good features without losing generality, we developed a variant of the widely used PointNet++ architecture for point clouds and a U-Net CNN architecture for images. Moreover, we conduct experiments on a real-world dataset to show the effectiveness of our loss function and network structure. By visualizing the features, we show that our models indeed learn information from both images and LiDAR.
CV Oct 20, 2023
ROSS: Radar Off-road Semantic Segmentation
Peng Jiang, Srikanth Saripalli
As the demand for autonomous navigation in off-road environments increases, the need for effective solutions to understand these surroundings becomes essential. In this study, we confront the inherent complexities of semantic segmentation in RADAR data for off-road scenarios. We present a novel pipeline that utilizes LIDAR data and an existing annotated off-road LIDAR dataset for generating RADAR labels, in which the RADAR data are represented as images. Validated with real-world datasets, our pragmatic approach underscores the potential of RADAR technology for navigation applications in off-road environments.
RO Sep 26, 2022
CAMEL: Learning Cost-maps Made Easy for Off-road Driving
Kasi Vishwanath, P. B. Sujit, Srikanth Saripalli
Cost-maps are used by robotic vehicles to plan collision-free paths. The cost associated with each cell in the map represents the sensed environment information, which is often determined manually after several trial-and-error efforts. In off-road environments, due to the presence of several types of features, it is challenging to handcraft the cost values associated with each feature. Moreover, different handcrafted cost values can lead to different paths for the same environment, which is not desirable. In this paper, we address the problem of learning the cost-map values from the sensed environment for robust vehicle path planning. We propose a novel deep learning framework called CAMEL that learns the parameters through demonstrations, yielding an adaptive and robust cost-map for path planning. CAMEL has been trained on multi-modal datasets such as RELLIS-3D. The evaluation of CAMEL is carried out on an off-road scene simulator (MAVS) and on field data from the IISER-B campus. We also perform a real-world implementation of CAMEL on a ground rover. The results show flexible and robust motion of the vehicle without collisions in unstructured terrains.
CV Jan 2, 2024 Code
Off-Road LiDAR Intensity Based Semantic Segmentation
Kasi Viswanath, Peng Jiang, Sujit PB et al.
LiDAR is used in autonomous driving to provide 3D spatial information and enable accurate perception in off-road environments, aiding in obstacle detection, mapping, and path planning. Learning-based LiDAR semantic segmentation utilizes machine learning techniques to automatically classify objects and regions in LiDAR point clouds. Learning-based models struggle in off-road environments due to the presence of diverse objects with varying colors, textures, and undefined boundaries, which can lead to difficulties in accurately classifying and segmenting objects using traditional geometric-based features. In this paper, we address this problem by harnessing the LiDAR intensity parameter to enhance object segmentation in off-road environments. Our approach was evaluated on the RELLIS-3D dataset and, as a preliminary analysis, yielded promising results with improved mIoU for the classes "puddle" and "grass" compared to more complex deep learning-based benchmarks. The methodology was evaluated for compatibility across both Velodyne and Ouster LiDAR systems, assuring its cross-platform applicability. This analysis advocates for the incorporation of calibrated intensity as a supplementary input, aiming to enhance the prediction accuracy of learning-based semantic segmentation frameworks. https://github.com/MOONLABIISERB/lidar-intensity-predictor/tree/main
RO Sep 27, 2021 Code
G-VOM: A GPU Accelerated Voxel Off-Road Mapping System
Timothy Overbye, Srikanth Saripalli
We present a local 3D voxel mapping framework for off-road path planning and navigation. Our method provides both hard and soft positive obstacle detection, negative obstacle detection, slope estimation, and roughness estimation. By using a 3D array lookup table data structure and by leveraging the GPU it can provide online performance. We then demonstrate the system working on three vehicles, a Clearpath Robotics Warthog, Moose, and a Polaris Ranger, and compare against a set of pre-recorded waypoints. This was done at 4.5 m/s in autonomous operation and 12 m/s in manual operation with a map update rate of 10 Hz. Finally, an open-source ROS implementation is provided. https://github.com/unmannedlab/G-VOM
RO Sep 16, 2021 Code
ROOAD: RELLIS Off-road Odometry Analysis Dataset
George Chustz, Srikanth Saripalli
The development and implementation of visual-inertial odometry (VIO) has focused on structured environments, but interest in localization in off-road environments is growing. In this paper, we present the RELLIS Off-road Odometry Analysis Dataset (ROOAD), which provides high-quality, time-synchronized off-road monocular visual-inertial data sequences to further the development of related research. We evaluated the dataset on two state-of-the-art VIO algorithms, (1) OpenVINS and (2) VINS-Fusion. Our findings indicate that both algorithms perform 2 to 30 times worse on the ROOAD dataset compared to their performance in structured environments. Furthermore, OpenVINS has better tracking stability and real-time performance than VINS-Fusion in the off-road environment, while VINS-Fusion outperformed OpenVINS in tracking accuracy in several data sequences. Since the camera-IMU calibration tool from the Kalibr toolkit is used extensively in this work, we have included several calibration data sequences. Our hand measurements show that Kalibr's tool achieved +/-1 degree for orientation error and, for position error in the camera frame between the camera and IMU, +/-1 mm at best (x- and y-axis) and +/-10 mm at worst (z-axis). This novel dataset provides a new set of scenarios for researchers to design and test their localization algorithms on, as well as critical insights into the current performance of VIO in off-road environments. ROOAD Dataset: github.com/unmannedlab/ROOAD
RO Apr 29, 2021 Code
Vehicular Teamwork: Collaborative localization of Autonomous Vehicles
Jacob Hartzer, Srikanth Saripalli
This paper develops a distributed collaborative localization (DCL) algorithm based on an extended Kalman filter. This algorithm incorporates Ultra-Wideband (UWB) measurements for vehicle-to-vehicle ranging, and shows improvements in localization accuracy where GPS typically falls short. The algorithm was first tested in a newly created open-source simulation environment that emulates various numbers of vehicles and sensors while simultaneously testing multiple localization algorithms. Predicted error distributions for various algorithms can be produced quickly using the Monte Carlo method and optimization techniques in MATLAB. The simulation results were validated experimentally in an outdoor, urban environment. Improvements in localization accuracy over a typical extended Kalman filter ranged from 2.9% to 9.3% over 180-meter test runs. When GPS was denied, these improvements increased up to 83.3% over a standard Kalman filter. Both in simulation and experimentally, the DCL algorithm was shown to be a good approximation of a full state filter, while reducing required communication between vehicles. These results are promising in showing the efficacy of adding UWB ranging sensors to cars for collaborative and landmark localization, especially in GPS-denied environments. In the future, additional moving vehicles with additional tags will be tested in other challenging GPS-denied environments.
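To make the role of a range measurement in an extended Kalman filter concrete, here is a minimal sketch of a single measurement update. It is not the paper's DCL algorithm: it assumes a 2D position-only state, a ranging partner at a known position (`anchor`), and a scalar range variance `r_var`, all of which are simplifications chosen for this example. The function name `ekf_range_update` is hypothetical.

```python
import math

def ekf_range_update(x, P, anchor, z, r_var):
    """One EKF measurement update of a 2D position using a scalar range.

    x: [px, py] position estimate, P: 2x2 covariance (list of lists),
    anchor: known [ax, ay] of the ranging partner (assumption),
    z: measured range, r_var: range noise variance.
    """
    dx, dy = x[0] - anchor[0], x[1] - anchor[1]
    pred = math.hypot(dx, dy)                  # predicted range h(x)
    if pred == 0.0:
        return x, P                            # Jacobian undefined; skip update
    H = [dx / pred, dy / pred]                 # 1x2 Jacobian of the range
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]    # P H^T
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var  # innovation variance
    K = [PHt[0] / S, PHt[1] / S]               # Kalman gain
    innov = z - pred                           # measurement residual
    x_new = [x[0] + K[0] * innov, x[1] + K[1] * innov]
    # Covariance update: P_new = (I - K H) P
    HP = [H[0] * P[0][0] + H[1] * P[1][0],
          H[0] * P[0][1] + H[1] * P[1][1]]
    P_new = [[P[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return x_new, P_new
```

The update only reduces uncertainty along the line of sight to the ranging partner, which is one reason multiple partners or additional sensors are typically needed for a well-constrained position.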
RO Apr 25, 2021 Code
Target-free Extrinsic Calibration of a 3D-Lidar and an IMU
Subodh Mishra, Gaurav Pandey, Srikanth Saripalli
This work presents a novel target-free extrinsic calibration algorithm for a 3D Lidar and an IMU pair using an Extended Kalman Filter (EKF), which exploits the motion-based calibration constraint for state update. The steps include data collection by motion excitation of the Lidar-inertial sensor suite along all degrees of freedom, determination of the inter-sensor rotation using the rotational component of the aforementioned motion-based calibration constraint in a least-squares optimization framework, and finally determination of the inter-sensor translation using the motion-based calibration constraint for state update in an EKF framework. We experimentally validate our method using data collected in our lab and open-source our contribution (https://github.com/unmannedlab/imu_lidar_calibration) for the robotics research community.
CV Nov 17, 2020 Code
RELLIS-3D Dataset: Data, Benchmarks and Analysis
Peng Jiang, Philip Osteen, Maggie Wigness et al.
Semantic scene understanding is crucial for robust and safe autonomous navigation, particularly so in off-road environments. Recent deep learning advances for 3D semantic segmentation rely heavily on large sets of training data; however, existing autonomy datasets either represent urban environments or lack multimodal off-road data. We fill this gap with RELLIS-3D, a multimodal dataset collected in an off-road environment, which contains annotations for 13,556 LiDAR scans and 6,235 images. The data was collected on the Rellis Campus of Texas A&M University and presents challenges to existing algorithms related to class imbalance and environmental topography. Additionally, we evaluate the current state-of-the-art deep learning semantic segmentation models on this dataset. Experimental results show that RELLIS-3D presents challenges for algorithms designed for segmentation in urban environments. This novel dataset provides the resources needed by researchers to continue to develop more advanced algorithms and investigate new research directions to enhance autonomous navigation in off-road environments. RELLIS-3D is available at https://github.com/unmannedlab/RELLIS-3D
CV Mar 17, 2024
3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization
Peng Jiang, Gaurav Pandey, Srikanth Saripalli
This paper presents a novel system designed for 3D mapping and visual relocalization using 3D Gaussian Splatting. Our proposed method uses LiDAR and camera data to create accurate and visually plausible representations of the environment. By leveraging LiDAR data to initiate the training of the 3D Gaussian Splatting map, our system constructs maps that are both detailed and geometrically accurate. To mitigate excessive GPU memory usage and facilitate rapid spatial queries, we employ a combination of a 2D voxel map and a KD-tree. This preparation makes our method well-suited for visual localization tasks, enabling efficient identification of correspondences between the query image and the rendered image from the Gaussian Splatting map via normalized cross-correlation (NCC). Additionally, we refine the camera pose of the query image using feature-based matching and the Perspective-n-Point (PnP) technique. The effectiveness, adaptability, and precision of our system are demonstrated through extensive evaluation on the KITTI360 dataset.
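As a rough illustration of the normalized cross-correlation (NCC) score mentioned above, the following pure-Python sketch compares two equally sized grayscale patches given as flat lists of pixel values. The function name `ncc` and the flat-list patch format are assumptions made for this example; the abstract does not specify how the paper computes or applies NCC.

```python
import math

def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-length patches.

    Returns a value in [-1, 1]; returns 0.0 if either patch has no variance.
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    da = [a - mean_a for a in patch_a]   # zero-mean copies
    db = [b - mean_b for b in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0
```

Because the score subtracts the mean and divides by the patch energy, it is insensitive to uniform brightness and contrast changes between the query image and the rendered image.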
27.8 RO Apr 27
Pushing Radar Odometry Beyond the Pavement: Current Capabilities and Challenges
Shaunak Kolhe, Peng Jiang, Maggie Wigness et al.
Radar offers unique advantages for localization in unstructured environments, including robustness to weather, lighting, and airborne particulates. While most prior work has studied radar odometry in urban, largely planar settings, its performance in off-road environments remains less understood. In this paper, we investigate the potential of radar for off-road odometry estimation and identify key challenges that arise from full $SE(3)$ vehicle motion, terrain-induced ground returns, and sparse or unstable features. To address these issues, we introduce two simple baselines: Radar-KISSICP, which applies motion compensation to generate 3D-aware radar pointclouds, and Radar-IMU, which leverages IMU preintegration to stabilize scan matching. Experiments on the Great Outdoors (GO) dataset demonstrate that these baselines improve trajectory estimation in challenging routes and provide a reference point for future development of radar odometry in off-road robotics.
CV Jun 20, 2025
AnyTraverse: An off-road traversability framework with VLM and human operator in the loop
Sattwik Sahu, Agamdeep Singh, Karthik Nambiar et al.
Off-road traversability segmentation enables autonomous navigation with applications in search-and-rescue, military operations, wildlife exploration, and agriculture. Current frameworks struggle due to significant variations in unstructured environments and uncertain scene changes, and do not adapt to different robot types. We present AnyTraverse, a framework combining natural language-based prompts with human-operator assistance to determine navigable regions for diverse robotic vehicles. The system segments scenes for a given set of prompts and calls the operator only when it encounters previously unexplored scenery or an unknown class that is not part of the prompt in its region of interest, thus reducing active supervision load while adapting to varying outdoor scenes. Our zero-shot learning approach eliminates the need for extensive data collection or retraining. Our experimental validation includes testing on the RELLIS-3D, Freiburg Forest, and RUGD datasets and demonstrates real-world deployment on multiple robot platforms. The results show that AnyTraverse performs better than GA-NAV and Off-seg while offering a vehicle-agnostic approach to off-road traversability that balances automation with targeted human supervision.
RO Feb 26, 2025
Learning Autonomy: Off-Road Navigation Enhanced by Human Input
Akhil Nagariya, Dimitar Filev, Srikanth Saripalli et al.
In the area of autonomous driving, navigating off-road terrains presents a unique set of challenges, from unpredictable surfaces like grass and dirt to unexpected obstacles such as bushes and puddles. In this work, we present a novel learning-based local planner that addresses these challenges by directly capturing human driving nuances from real-world demonstrations using only a monocular camera. The key features of our planner are its ability to navigate in challenging off-road environments with various terrain types and its fast learning capabilities. By utilizing minimal human demonstration data (5-10 mins), it quickly learns to navigate in a wide array of off-road conditions. The local planner significantly reduces the real world data required to learn human driving preferences. This allows the planner to apply learned behaviors to real-world scenarios without the need for manual fine-tuning, demonstrating quick adjustment and adaptability in off-road autonomous driving technology.
CV Sep 30, 2025
Stylos: Multi-View 3D Stylization with Single-Forward Gaussian Splatting
Hanzhou Liu, Jia Huang, Mi Lu et al.
We present Stylos, a single-forward 3D Gaussian framework for 3D style transfer that operates on unposed content, from a single image to a multi-view collection, conditioned on a separate reference style image. Stylos synthesizes a stylized 3D Gaussian scene without per-scene optimization or precomputed poses, achieving geometry-aware, view-consistent stylization that generalizes to unseen categories, scenes, and styles. At its core, Stylos adopts a Transformer backbone with two pathways: geometry predictions retain self-attention to preserve geometric fidelity, while style is injected via global cross-attention to enforce visual consistency across views. With the addition of a voxel-based 3D style loss that aligns aggregated scene features to style statistics, Stylos enforces view-consistent stylization while preserving geometry. Experiments across multiple datasets demonstrate that Stylos delivers high-quality zero-shot stylization, highlighting the effectiveness of global style-content coupling, the proposed 3D style loss, and the scalability of our framework from single view to large-scale multi-view settings.
CV Mar 19, 2024
Reflectivity Is All You Need!: Advancing LiDAR Semantic Segmentation
Kasi Viswanath, Peng Jiang, Srikanth Saripalli
LiDAR semantic segmentation frameworks predominantly use geometry-based features to differentiate objects within a scan. Although these methods excel in scenarios with clear boundaries and distinct shapes, their performance declines in environments where boundaries are indistinct, particularly in off-road contexts. To address this issue, recent advances in 3D segmentation algorithms have aimed to leverage raw LiDAR intensity readings to improve prediction precision. However, despite these advances, existing learning-based models struggle to capture the complex interactions between raw intensity and variables such as distance, incidence angle, material reflectivity, and atmospheric conditions. Building upon our previous work, this paper explores the advantages of employing calibrated intensity (also referred to as reflectivity) within learning-based LiDAR semantic segmentation frameworks. We start by demonstrating that adding reflectivity as input enhances the LiDAR semantic segmentation model by providing a better data representation. Extensive experimentation with the RELLIS-3D off-road dataset shows that replacing intensity with reflectivity results in a 4% improvement in mean Intersection over Union (mIoU) for off-road scenarios. We demonstrate the potential benefits of using calibrated intensity for semantic segmentation in urban environments (SemanticKITTI) and for cross-sensor domain adaptation. Additionally, we tested the Segment Anything Model (SAM) using reflectivity as input, resulting in improved segmentation masks for LiDAR images.
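For readers unfamiliar with the metric, mean Intersection over Union (mIoU) can be computed as sketched below. This is the standard per-class definition, not code from the paper; the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions of this example.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in the prediction or the ground truth.

    pred, target: equal-length sequences of integer class labels per point.
    """
    ious = []
    for c in range(num_classes):
        # Points labeled c in both (intersection) or in either (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:                 # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mIoU is 7/12.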
RO May 22, 2023
Learning Pedestrian Actions to Ensure Safe Autonomous Driving
Jia Huang, Alvika Gautam, Srikanth Saripalli
To ensure safe autonomous driving in urban environments with complex vehicle-pedestrian interactions, it is critical for Autonomous Vehicles (AVs) to have the ability to predict pedestrians' short-term and immediate actions in real time. In recent years, various methods have been developed to estimate pedestrian behaviors for autonomous driving scenarios, but there is a lack of clear definitions for pedestrian behaviors. In this work, the literature gaps are investigated and a taxonomy is presented for pedestrian behavior characterization. Further, a novel multi-task sequence-to-sequence Transformer encoders-decoders (TF-ed) architecture is proposed for pedestrian action and trajectory prediction using only ego vehicle camera observations as inputs. The proposed approach is compared against an existing LSTM encoders-decoders (LSTM-ed) architecture for action and trajectory prediction. The performance of both models is evaluated on the publicly available Joint Attention Autonomous Driving (JAAD) dataset, CARLA simulation data, and real-time self-driving shuttle data collected on a university campus. Evaluation results illustrate that the proposed method reaches an accuracy of 81% on the action prediction task on JAAD testing data and outperforms the LSTM-ed by 7.4%, while the LSTM counterpart performs much better on the trajectory prediction task for a prediction sequence length of 25 frames.
RO May 17, 2023
Improving Extrinsics between RADAR and LIDAR using Learning
Peng Jiang, Srikanth Saripalli
LIDAR and RADAR are two commonly used sensors in autonomous driving systems. The extrinsic calibration between the two is crucial for effective sensor fusion. The challenge arises due to the low accuracy and sparse information in RADAR measurements. This paper presents a novel solution for 3D RADAR-LIDAR calibration in autonomous systems. The method employs simple targets to generate data, and includes correspondence registration and a one-step optimization algorithm. The optimization aims to minimize the reprojection error while utilizing a small multi-layer perceptron (MLP) to perform regression on the return energy of the sensor around the targets. The proposed approach uses a deep learning framework such as PyTorch and can be optimized through gradient descent. The experiment uses a 360-degree Ouster-128 LIDAR and a 360-degree Navtech RADAR, providing raw measurements. The results validate the effectiveness of the proposed method in achieving improved estimates of extrinsic calibration parameters.
RO Oct 5, 2021
OTTR: Off-Road Trajectory Tracking using Reinforcement Learning
Akhil Nagariya, Dileep Kalathil, Srikanth Saripalli
In this work, we present a novel Reinforcement Learning (RL) algorithm for the off-road trajectory tracking problem. Off-road environments involve varying terrain types and elevations, and it is difficult to model the interaction dynamics of specific off-road vehicles with such a diverse and complex environment. Standard RL policies trained on a simulator will fail to operate in such challenging real-world settings. Instead of using a naive domain randomization approach, we propose an innovative supervised-learning based approach for overcoming the sim-to-real gap problem. Our approach efficiently exploits the limited real-world data available to adapt the baseline RL policy obtained using a simple kinematics simulator. This avoids the need for modeling the diverse and complex interaction of the vehicle with off-road environments. We evaluate the performance of the proposed algorithm using two different off-road vehicles, Warthog and Moose. Compared to the standard ILQR approach, our proposed approach achieves a 30% and 50% reduction in cross track error in Warthog and Moose, respectively, by utilizing only 30 minutes of real-world driving data.
CV Sep 21, 2021
SemCal: Semantic LiDAR-Camera Calibration using Neural Mutual Information Estimator
Peng Jiang, Philip Osteen, Srikanth Saripalli
This paper proposes SemCal: an automatic, targetless, extrinsic calibration algorithm for a LiDAR and camera system using semantic information. We leverage a neural information estimator to estimate the mutual information (MI) of semantic information extracted from each sensor measurement, facilitating semantic-level data association. By using a matrix exponential formulation of the $se(3)$ transformation and a kernel-based sampling method to sample from camera measurements based on LiDAR projected points, we can formulate the LiDAR-camera calibration problem as a novel differentiable objective function that supports gradient-based optimization methods. We also introduce a semantic-based initial calibration method using 2D MI-based image registration and a Perspective-n-Point (PnP) solver. To evaluate performance, we demonstrate the robustness of our method and quantitatively analyze the accuracy using a synthetic dataset. We also evaluate our algorithm qualitatively on an urban benchmark dataset (KITTI360) and an off-road benchmark dataset (RELLIS-3D), using both hand-annotated ground truth labels and labels predicted by state-of-the-art deep learning models, showing improvement over recent comparable calibration approaches.
RO Apr 29, 2021
AutoCone: An OmniDirectional Robot for Lane-Level Cone Placement
Jacob Hartzer, Srikanth Saripalli
This paper summarizes the progress in developing a rugged, low-cost, automated ground cone robot network capable of traffic delineation at lane-level precision. A holonomic omnidirectional base with a traffic delineator was developed to allow flexibility in initialization. RTK GPS was utilized to reduce minimum position error to 2 centimeters. Due to recent developments, the cost of the platform is now less than $1,600. To minimize the effects of GPS-denied environments, wheel encoders and an Extended Kalman Filter were implemented to maintain lane-level accuracy during operation and a maximum error of 1.97 meters through 50 meters with little to no GPS signal. Future work includes increasing the operational speed of the platforms, incorporating lanelet information for path planning, and cross-platform estimation.
CV Apr 24, 2021
Calibrating LiDAR and Camera using Semantic Mutual information
Peng Jiang, Philip Osteen, Srikanth Saripalli
We propose an algorithm for automatic, targetless, extrinsic calibration of a LiDAR and camera system using semantic information. We achieve this goal by maximizing mutual information (MI) of semantic information between sensors, leveraging a neural network to estimate semantic mutual information, and matrix exponential for calibration computation. Using kernel-based sampling to sample data from camera measurement based on LiDAR projected points, we formulate the problem as a novel differentiable objective function which supports the use of gradient-based optimization methods. We also introduce an initial calibration method using 2D MI-based image registration. Finally, we demonstrate the robustness of our method and quantitatively analyze the accuracy on a synthetic dataset and also evaluate our algorithm qualitatively on KITTI360 and RELLIS-3D benchmark datasets, showing improvement over recent comparable approaches.
CV Mar 23, 2021
OFFSEG: A Semantic Segmentation Framework For Off-Road Driving
Kasi Viswanath, Kartikeya Singh, Peng Jiang et al.
Off-road image semantic segmentation is challenging due to the presence of uneven terrains, unstructured class boundaries, irregular features, and strong textures. These aspects affect the vehicle's perception, which provides the information used for path planning. Current off-road datasets exhibit difficulties such as class imbalance and varying environmental topography. To overcome these issues, we propose a framework for off-road semantic segmentation called OFFSEG that involves (i) a pooled class semantic segmentation with four classes (sky, traversable region, non-traversable region, and obstacle) using state-of-the-art deep learning architectures and (ii) a colour segmentation methodology to segment out specific sub-classes (grass, puddle, dirt, gravel, etc.) from the traversable region for better scene understanding. The evaluation of the framework is carried out on two off-road driving datasets, namely RELLIS-3D and RUGD. We have also tested the proposed framework on IISERB campus frames. The results show that OFFSEG achieves good performance and also provides detailed information on the traversable region.
RO Jan 4, 2021
Path Optimization for Ground Vehicles in Off-Road Terrain
Timothy Overbye, Srikanth Saripalli
We present a method for path optimization for ground vehicles in off-road environments at high speeds. This path optimization considers the kinematic constraints of the vehicle. By working in the actuator space, we can represent these constraints as limits in that space rather than as derived properties of the path. In this paper, we present an actuator-space approach to path optimization for off-road ground vehicles. This is done by representing and operating on the path as a list of steering angles over the path length, which transforms the set of kinematic constraints into constraints on the steering angle. We then pass this path to a gradient descent solver, which produces paths that are kinematically feasible and optimized in accordance with our cost function. Finally, we tested the system both in simulation and on an off-road vehicle at speeds of 5 m/s.
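The idea of representing a path as a list of steering angles can be sketched as follows. This example assumes a kinematic bicycle model, a fixed arc-length step, and illustrative values for `wheelbase` and `max_steer`; none of these details are given in the abstract, and the function name `rollout` is hypothetical. It covers only the path representation, not the gradient descent optimization.

```python
import math

def rollout(steering_angles, step=0.5, wheelbase=1.0, max_steer=0.5):
    """Integrate a kinematic bicycle model along a list of steering angles.

    Each angle (radians) applies over one arc-length step (meters).
    Clamping the angle to +/- max_steer enforces the curvature limit
    directly in actuator space.
    """
    x, y, heading = 0.0, 0.0, 0.0
    path = [(x, y)]
    for delta in steering_angles:
        delta = max(-max_steer, min(max_steer, delta))    # actuator limit
        x += step * math.cos(heading)                     # advance along heading
        y += step * math.sin(heading)
        heading += step * math.tan(delta) / wheelbase     # bicycle-model turn
        path.append((x, y))
    return path
```

Because every list of clamped steering angles maps to a drivable path under this model, an optimizer can adjust the angles freely without a separate feasibility check on the resulting geometry.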
RO Jul 28, 2020
An Iterative LQR Controller for Off-Road and On-Road Vehicles using a Neural Network Dynamics Model
Akhil Nagariya, Srikanth Saripalli
In this work, we evaluate an Iterative Linear Quadratic Regulator (ILQR) for trajectory tracking of two different kinds of wheeled mobile robots, namely Warthog (Fig. 1), an off-road holonomic robot with skid-steering, and Polaris GEM e6 [1], a non-holonomic six-seater vehicle (Fig. 2). We use a multilayer neural network to learn the discrete dynamic model of these robots, which is used in the ILQR controller to compute the control law. We use model predictive control (MPC) to deal with model imperfections and perform extensive experiments to evaluate the performance of the controller on human-driven reference trajectories with vehicle speeds of 3 m/s to 4 m/s for the Warthog and 7 m/s to 10 m/s for the Polaris GEM.
RO Jul 3, 2020
Experimental Evaluation of 3D-LIDAR Camera Extrinsic Calibration
Subodh Mishra, Philip R. Osteen, Gaurav Pandey et al.
In this paper, we perform an experimental comparison of three different target-based 3D-LIDAR camera calibration algorithms. We briefly elucidate the mathematical background behind each method and provide insights into practical aspects such as ease of data collection for all of them. We extensively evaluate these algorithms on a sensor suite that consists of multiple cameras and LIDARs by assessing their robustness to random initialization and by using metrics such as Mean Line Re-projection Error (MLRE) and Factory Stereo Calibration Error. We also show the effect of noisy sensors on the calibration results from all the algorithms and conclude with a note on which calibration algorithm should be used under what circumstances.
RO Mar 2, 2020
Extrinsic Calibration of a 3D-LIDAR and a Camera
Subodh Mishra, Gaurav Pandey, Srikanth Saripalli
This work presents an extrinsic parameter estimation algorithm between a 3D LIDAR and a Projective Camera using a marker-less planar target, by exploiting Planar Surface Point to Plane and Planar Edge Point to back-projected Plane geometric constraints. The proposed method uses the data collected by placing the planar board at different poses in the common field of view of the LIDAR and the Camera. The steps include detecting the target and its edges in the LIDAR and Camera frames, matching the detected planes and lines across both sensing modalities, and finally using non-linear least squares to solve a cost function formed by the aforementioned geometric constraints, which link the features detected in both the LIDAR and the Camera. We have extensively validated our algorithm using two Basler Cameras and Velodyne VLP-32 and Ouster OS1 LIDARs.
CVMar 2, 2020
LiDARNet: A Boundary-Aware Domain Adaptation Model for Point Cloud Semantic SegmentationPeng Jiang, Srikanth Saripalli
We present a boundary-aware domain adaptation model for LiDAR scan full-scene semantic segmentation (LiDARNet). Our model can extract both the domain private features and the domain shared features with a two-branch structure. We embedded Gated-SCNN into the segmentor component of LiDARNet to learn boundary information while learning to predict full-scene semantic segmentation labels. Moreover, we further reduce the domain gap by inducing the model to learn a mapping between two domains using the domain shared and private features. Additionally, we introduce a new dataset, SemanticUSL (available at https://unmannedlab.github.io/research/SemanticUSL), for domain adaptation for LiDAR point cloud semantic segmentation. The dataset has the same data format and ontology as SemanticKITTI. We conducted experiments on the real-world datasets SemanticKITTI, SemanticPOSS, and SemanticUSL, which differ in channel distributions, reflectivity distributions, diversity of scenes, and sensor setups. Using our approach, we can get a single projection-based LiDAR full-scene semantic segmentation model working on both domains. Our model can keep almost the same performance on the source domain after adaptation and get an 8%-22% mIoU performance increase in the target domain.
ROOct 18, 2019
Fast Local Planning and Mapping in Unknown Off-Road TerrainTimothy Overbye, Srikanth Saripalli
In this paper, we present a fast, on-line mapping and planning solution for operation in unknown, off-road environments. We combine obstacle detection with a terrain gradient map to make a simple and adaptable cost map. This map can be created and updated at 10 Hz. An A* planner finds optimal paths over the map. Finally, we take multiple samples over the control input space and do a kinematic forward simulation to generate feasible trajectories. Then the best trajectory, as determined by the cost map and proximity to the A* path, is chosen and sent to the controller. Our method allows real-time operation at rates of 30 Hz. We demonstrate the efficiency of our method in various off-road terrain at high speed.
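The sampling step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the unicycle motion model, the fixed speed, the set of sampled yaw rates, the grid layout, and the weighting of the reference-path term are all hypothetical choices.

```python
import math


def simulate(x, y, heading, speed, yaw_rate, dt=0.1, steps=20):
    """Forward-simulate a unicycle model and return the visited (x, y) points."""
    path = []
    for _ in range(steps):
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        heading += yaw_rate * dt
        path.append((x, y))
    return path


def trajectory_cost(path, cost_map, resolution, reference, w_ref=1.0):
    """Sum cost-map values along the path plus the distance to a reference path."""
    total = 0.0
    for px, py in path:
        row = math.floor(py / resolution)
        col = math.floor(px / resolution)
        if not (0 <= row < len(cost_map) and 0 <= col < len(cost_map[0])):
            return math.inf  # leaving the known map is treated as infeasible
        total += cost_map[row][col]
        total += w_ref * min(math.hypot(px - rx, py - ry) for rx, ry in reference)
    return total


def best_trajectory(state, cost_map, resolution, reference,
                    speed=2.0, yaw_rates=(-0.6, -0.3, 0.0, 0.3, 0.6)):
    """Sample yaw rates, roll each one out, and return the cheapest (yaw_rate, path)."""
    x, y, heading = state
    candidates = [(w, simulate(x, y, heading, speed, w)) for w in yaw_rates]
    return min(candidates,
               key=lambda c: trajectory_cost(c[1], cost_map, resolution, reference))


# Hypothetical 10 m x 10 m map with 1 m cells and a straight reference path.
free_map = [[0.0] * 10 for _ in range(10)]
reference = [(0.1 * i, 5.0) for i in range(100)]
start = (0.5, 5.0, 0.0)
choice_free = best_trajectory(start, free_map, 1.0, reference)

blocked_map = [row[:] for row in free_map]
blocked_map[5][3] = 1000.0  # high-cost cell directly on the reference path
choice_blocked = best_trajectory(start, blocked_map, 1.0, reference)
```

On the empty map the straight rollout follows the reference and is selected; once a high-cost cell lies on the reference, a turning rollout becomes cheaper. In a real planner the reference would come from the A* path rather than being fixed by hand.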
ROOct 11, 2019
Autonomous Shuttles for Last-Mile ConnectivityGarrison Neel, Amir Darwesh, Quang Le et al.
This paper describes an autonomous shuttle that targets last-mile transportation. Often, this involves operation in crowded areas with high levels of pedestrian traffic and little to no lane markings or traffic control. We aim to create a functional shuttle that can be improved upon in the future as new robust solutions are developed to replace the current components. An initial implementation of such a shuttle is presented, detailing the overall architecture, controller structure, waypoint following, obstacle detection and avoidance, LiDAR-based sign detection, and pedestrian communication. The performance of each component is evaluated, and future improvements are discussed.
ROMay 7, 2019
Collaborative Localization for Micro Aerial VehiclesSai Vemprala, Srikanth Saripalli
In this paper, we present a framework for performing collaborative localization for groups of micro aerial vehicles (MAV) that use vision based sensing. The vehicles are each assumed to be equipped with a forward-facing monocular camera, and to be capable of communicating with each other. This collaborative localization approach is developed as a decentralized algorithm and built in a distributed fashion where individual and relative pose estimation techniques are combined for the group to localize against surrounding environments. The MAVs initially detect and match salient features between each other to create a sparse reconstruction of the observed environment, which acts as a global map. Once a map is available, each MAV performs feature detection and tracking with a robust outlier rejection process to estimate its own pose in 6 degrees of freedom. Occasionally, one or more MAVs can be tasked to compute poses for another MAV through relative measurements, which is achieved by exploiting multiple view geometry concepts. These relative measurements are then fused with individual measurements in a consistent fashion. We present the results of the algorithm on image data from MAV flights both in simulation and real life, and discuss the advantages of collaborative localization in improving pose estimation accuracy.
ROAug 1, 2018
Drone Detection Using Depth MapsAdrian Carrio, Sai Vemprala, Andres Ripoll et al.
Obstacle avoidance is a key feature for safe Unmanned Aerial Vehicle (UAV) navigation. While solutions have been proposed for static obstacle avoidance, systems enabling avoidance of dynamic objects, such as drones, are hard to implement due to the detection range and field-of-view (FOV) requirements, as well as the constraints for integrating such systems on board small UAVs. In this work, a dataset of 6k synthetic depth maps of drones has been generated and used to train a state-of-the-art deep learning-based drone detection model. While many sensing technologies can only provide relative altitude and azimuth of an obstacle, our depth map-based approach enables full 3D localization of the obstacle. This is extremely useful for collision avoidance, as 3D localization of detected drones is key to performing efficient collision-free path planning. The proposed detection technique has been validated on several real depth map sequences, with multiple types of drones flying at up to 2 m/s, achieving an average precision of 98.7%, an average recall of 74.7% and a record detection range of 9.5 meters.
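To illustrate why a depth map allows full 3D localization of a detection, the sketch below back-projects the centre of a bounding box through a pinhole camera model. It is a minimal example under stated assumptions: the function name, the use of the median depth inside the box, and the intrinsics (fx, fy, cx, cy) are hypothetical and do not describe the method used in the paper.

```python
from statistics import median


def localize_from_depth(depth, box, fx, fy, cx, cy):
    """Return the (X, Y, Z) camera-frame position of a detection, or None.

    depth: 2D list of depth values in meters, indexed as depth[v][u]
    box:   (u_min, v_min, u_max, v_max) in pixels, max bounds exclusive
    """
    u_min, v_min, u_max, v_max = box
    values = [depth[v][u]
              for v in range(v_min, v_max)
              for u in range(u_min, u_max)
              if depth[v][u] > 0]  # zero or negative depth is treated as invalid
    if not values:
        return None
    z = median(values)  # robust to a few background pixels inside the box
    u_c = (u_min + u_max) / 2.0
    v_c = (v_min + v_max) / 2.0
    # Pinhole back-projection of the box centre at the estimated depth.
    return ((u_c - cx) * z / fx, (v_c - cy) * z / fy, z)


# Hypothetical 4x4 depth map with every pixel 5 m away.
flat = [[5.0] * 4 for _ in range(4)]
centre = localize_from_depth(flat, (0, 0, 4, 4), fx=2.0, fy=2.0, cx=2.0, cy=2.0)
corner = localize_from_depth(flat, (2, 0, 4, 2), fx=2.0, fy=2.0, cx=2.0, cy=2.0)
```

A sensor that reports only bearing would recover the direction of the ray through the box centre but not the depth along it; the depth value supplies that missing coordinate.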
ROApr 7, 2018
Monocular Vision based Collaborative Localization for Micro Aerial Vehicle SwarmsSai Vemprala, Srikanth Saripalli
In this paper, we present a vision based collaborative localization framework for groups of micro aerial vehicles (MAV). The vehicles are each assumed to be equipped with a forward-facing monocular camera, and to be capable of communicating with each other. This collaborative localization approach is built upon a distributed algorithm where individual and relative pose estimation techniques are combined for the group to localize against surrounding environments. The MAVs initially detect and match salient features between each other to create a sparse reconstruction of the observed environment, which acts as a global map. Once a map is available, each MAV performs feature detection and tracking with a robust outlier rejection process to estimate its own six degree-of-freedom pose. Occasionally, the MAVs can also fuse relative measurements with individual measurements through feature matching and multiple-view geometry based relative pose computation. We present the implementation of this algorithm for MAVs and environments simulated within Microsoft AirSim, and discuss the results and the advantages of collaborative localization.