Vikas Dhiman

h-index11

16papers

340citations

Novelty50%

AI Score36

Ranked #98,412 of 194,257 authors (top 51%)#2,952 in RO (top 44%)

16 Papers

15.5ROFeb 7, 2018Code

A Critical Investigation of Deep Reinforcement Learning for Navigation

Vikas Dhiman, Shurjo Banerjee, Brent Griffin et al.

The navigation problem is classically approached in two steps: an exploration step, where map-information about the environment is gathered; and an exploitation step, where this information is used to navigate efficiently. Deep reinforcement learning (DRL) algorithms, alternatively, approach the problem of navigation in an end-to-end fashion. Inspired by the classical approach, we ask whether DRL algorithms are able to inherently explore, gather and exploit map-information over the course of navigation. We build upon Mirowski et al. [2017] work and introduce a systematic suite of experiments that vary three parameters: the agent's starting location, the agent's target location, and the maze structure. We choose evaluation metrics that explicitly measure the algorithm's ability to gather and exploit map-information. Our experiments show that when trained and tested on the same maps, the algorithm successfully gathers and exploits map-information. However, when trained and tested on different sets of maps, the algorithm fails to transfer the ability to gather and exploit map-information to unseen maps. Furthermore, we find that when the goal location is randomized and the map is kept static, the algorithm is able to gather and exploit map-information but the exploitation is far from optimal. We open-source our experimental suite in the hopes that it serves as a framework for the comparison of future algorithms and leads to the discovery of robust alternatives to classical navigation methods.

2.0CVMar 13, 2024Code

FogGuard: guarding YOLO against fog using perceptual loss

Soheil Gharatappeh, Sepideh Neshatfar, Salimeh Yasaei Sekeh et al.

In this paper, we present FogGuard, a novel fog-aware object detection network designed to address the challenges posed by foggy weather conditions. Autonomous driving systems heavily rely on accurate object detection algorithms, but adverse weather conditions can significantly impact the reliability of deep neural networks (DNNs). Existing approaches include image enhancement techniques like IA-YOLO and domain adaptation methods. While image enhancement aims to generate clear images from foggy ones, which is more challenging than object detection in foggy images, domain adaptation does not require labeled data in the target domain. Our approach involves fine-tuning on a specific dataset to address these challenges efficiently. FogGuard compensates for foggy conditions in the scene, ensuring robust performance by incorporating YOLOv3 as the baseline algorithm and introducing a unique Teacher-Student Perceptual loss for accurate object detection in foggy environments. Through comprehensive evaluations on standard datasets like PASCAL VOC and RTTS, our network significantly improves performance, achieving a 69.43\% mAP compared to YOLOv3's 57.78\% on the RTTS dataset. Additionally, we demonstrate that while our training method slightly increases time complexity, it doesn't add overhead during inference compared to the regular YOLO network.

2.3SPJan 8, 2025

DAREK -- Distance Aware Error for Kolmogorov Networks

Masoud Ataei, Mohammad Javad Khojasteh, Vikas Dhiman

In this paper, we provide distance-aware error bounds for Kolmogorov Arnold Networks (KANs). We call our new error bounds estimator DAREK -- Distance Aware Error for Kolmogorov networks. Z. Liu et al. provide error bounds, which may be loose, lack distance-awareness, and are defined only up to an unknown constant of proportionality. We review the error bounds for Newton's polynomial, which is then generalized to an arbitrary spline, under Lipschitz continuity assumptions. We then extend these bounds to nested compositions of splines, arriving at error bounds for KANs. We evaluate our method by estimating an object's shape from sparse laser scan points. We use KAN to fit a smooth function to the scans and provide error bounds for the fit. We find that our method is faster than Monte Carlo approaches, and that our error bounds enclose the true obstacle shape reliably.

6.2CVApr 15, 2025

Weather-Aware Object Detection Transformer for Domain Adaptation

Soheil Gharatappeh, Salimeh Sekeh, Vikas Dhiman

RT-DETRs have shown strong performance across various computer vision tasks but are known to degrade under challenging weather conditions such as fog. In this work, we investigate three novel approaches to enhance RT-DETR robustness in foggy environments: (1) Domain Adaptation via Perceptual Loss, which distills domain-invariant features from a teacher network to a student using perceptual supervision; (2) Weather Adaptive Attention, which augments the attention mechanism with fog-sensitive scaling by introducing an auxiliary foggy image stream; and (3) Weather Fusion Encoder, which integrates a dual-stream encoder architecture that fuses clear and foggy image features via multi-head self and cross-attention. Despite the architectural innovations, none of the proposed methods consistently outperform the baseline RT-DETR. We analyze the limitations and potential causes, offering insights for future research in weather-aware object detection.

7.1ROJun 30, 2024

DADEE: Well-calibrated uncertainty quantification in neural networks for barriers-based robot safety

Masoud Ataei, Vikas Dhiman

Uncertainty-aware controllers that guarantee safety are critical for safety critical applications. Among such controllers, Control Barrier Functions (CBFs) based approaches are popular because they are fast, yet safe. However, most such works depend on Gaussian Processes (GPs) or MC-Dropout for learning and uncertainty estimation, and both approaches come with drawbacks: GPs are non-parametric methods that are slow, while MC-Dropout does not capture aleatoric uncertainty. On the other hand, modern Bayesian learning algorithms have shown promise in uncertainty quantification. The application of modern Bayesian learning methods to CBF-based controllers has not yet been studied. We aim to fill this gap by surveying uncertainty quantification algorithms and evaluating them on CBF-based safe controllers. We find that model variance-based algorithms (for example, Deep ensembles, MC-dropout, etc.) and direct estimation-based algorithms (such as DEUP) have complementary strengths. Algorithms in the former category can only estimate uncertainty accurately out-of-domain, while those in the latter category can only do so in-domain. We combine the two approaches to obtain more accurate uncertainty estimates both in- and out-of-domain. As measured by the failure rate of a simulated robot, this results in a safer CBF-based robot controller.

13.5ROFeb 19, 2022

Safe Control Synthesis with Uncertain Dynamics and Constraints

Kehan Long, Vikas Dhiman, Melvin Leok et al.

This paper considers safe control synthesis for dynamical systems with either probabilistic or worst-case uncertainty in both the dynamics model and the safety constraints. We formulate novel probabilistic and robust (worst-case) control Lyapunov function (CLF) and control barrier function (CBF) constraints that take into account the effect of uncertainty in either case. We show that either the probabilistic or the robust (worst-case) formulation leads to a second-order cone program (SOCP), which enables efficient safe and stable control synthesis. We evaluate our approach in PyBullet simulations of an autonomous robot navigating in unknown environments and compare the performance with a baseline CLF-CBF quadratic programming approach.

3.1LGJan 1, 2021

Inverse reinforcement learning for autonomous navigation via differentiable semantic mapping and planning

Tianyu Wang, Vikas Dhiman, Nikolay Atanasov

This paper focuses on inverse reinforcement learning for autonomous navigation using distance and semantic category observations. The objective is to infer a cost function that explains demonstrated behavior while relying only on the expert's observations and state-control trajectory. We develop a map encoder, that infers semantic category probabilities from the observation sequence, and a cost encoder, defined as a deep neural network over the semantic features. Since the expert cost is not directly observable, the model parameters can only be optimized by differentiating the error between demonstrated controls and a control policy computed from the cost estimate. We propose a new model of expert behavior that enables error minimization using a closed-form subgradient computed only over a subset of promising states via a motion planning algorithm. Our approach allows generalizing the learned behavior to new environments with new spatial configurations of the semantic categories. We analyze the different components of our model in a minigrid environment. We also demonstrate that our approach learns to follow traffic rules in the autonomous driving CARLA simulator by relying on semantic observations of buildings, sidewalks, and road lanes.

16.1SYDec 29, 2020Code

Control Barriers in Bayesian Learning of System Dynamics

Vikas Dhiman, Mohammad Javad Khojasteh, Massimo Franceschetti et al.

This paper focuses on learning a model of system dynamics online while satisfying safety constraints. Our objective is to avoid offline system identification or hand-specified models and allow a system to safely and autonomously estimate and adapt its own model during operation. Given streaming observations of the system state, we use Bayesian learning to obtain a distribution over the system dynamics. Specifically, we propose a new matrix variate Gaussian process (MVGP) regression approach with an efficient covariance factorization to learn the drift and input gain terms of a nonlinear control-affine system. The MVGP distribution is then used to optimize the system behavior and ensure safety with high probability, by specifying control Lyapunov function (CLF) and control barrier function (CBF) chance constraints. We show that a safe control policy can be synthesized for systems with arbitrary relative degree and probabilistic CLF-CBF constraints by solving a second order cone program (SOCP). Finally, we extend our design to a self-triggering formulation, adaptively determining the time at which a new control input needs to be applied in order to guarantee safety.

8.3ROJul 29, 2020Code

OrcVIO: Object residual constrained Visual-Inertial Odometry

Mo Shan, Vikas Dhiman, Qiaojun Feng et al.

Introducing object-level semantic information into simultaneous localization and mapping (SLAM) system is critical. It not only improves the performance but also enables tasks specified in terms of meaningful objects. This work presents OrcVIO, for visual-inertial odometry tightly coupled with tracking and optimization over structured object models. OrcVIO differentiates through semantic feature and bounding-box reprojection errors to perform batch optimization over the pose and shape of objects. The estimated object states aid in real-time incremental optimization over the IMU-camera states. The ability of OrcVIO for accurate trajectory estimation and large-scale object-level mapping is evaluated using real data.

4.2LGJun 9, 2020

Learning Navigation Costs from Demonstration with Semantic Observations

Tianyu Wang, Vikas Dhiman, Nikolay Atanasov

This paper focuses on inverse reinforcement learning (IRL) for autonomous robot navigation using semantic observations. The objective is to infer a cost function that explains demonstrated behavior while relying only on the expert's observations and state-control trajectory. We develop a map encoder, which infers semantic class probabilities from the observation sequence, and a cost encoder, defined as deep neural network over the semantic features. Since the expert cost is not directly observable, the representation parameters can only be optimized by differentiating the error between demonstrated controls and a control policy computed from the cost estimate. The error is optimized using a closed-form subgradient computed only over a subset of promising states via a motion planning algorithm. We show that our approach learns to follow traffic rules in the autonomous driving CARLA simulator by relying on semantic observations of cars, sidewalks and road lanes.

5.0LGFeb 26, 2020

Learning Navigation Costs from Demonstration in Partially Observable Environments

Tianyu Wang, Vikas Dhiman, Nikolay Atanasov

This paper focuses on inverse reinforcement learning (IRL) to enable safe and efficient autonomous navigation in unknown partially observable environments. The objective is to infer a cost function that explains expert-demonstrated navigation behavior while relying only on the observations and state-control trajectory used by the expert. We develop a cost function representation composed of two parts: a probabilistic occupancy encoder, with recurrent dependence on the observation sequence, and a cost encoder, defined over the occupancy features. The representation parameters are optimized by differentiating the error between demonstrated controls and a control policy computed from the cost encoder. Such differentiation is typically computed by dynamic programming through the value function over the whole state space. We observe that this is inefficient in large partially observable environments because most states are unexplored. Instead, we rely on a closed-form subgradient of the cost-to-go obtained only over a subset of promising states via an efficient motion-planning algorithm such as A* or RRT. Our experiments show that our model exceeds the accuracy of baseline IRL algorithms in robot navigation tasks, while substantially improving the efficiency of training and test-time inference.

21.2RODec 20, 2019Code

Probabilistic Safety Constraints for Learned High Relative Degree System Dynamics

Mohammad Javad Khojasteh, Vikas Dhiman, Massimo Franceschetti et al.

This paper focuses on learning a model of system dynamics online while satisfying safety constraints.Our motivation is to avoid offline system identification or hand-specified dynamics models and allowa system to safely and autonomously estimate and adapt its own model during online operation.Given streaming observations of the system state, we use Bayesian learning to obtain a distributionover the system dynamics. In turn, the distribution is used to optimize the system behavior andensure safety with high probability, by specifying a chance constraint over a control barrier function.

3.5RODec 4, 2019

Learning from Interventions using Hierarchical Policies for Safe Learning

Jing Bi, Vikas Dhiman, Tianyou Xiao et al.

Learning from Demonstrations (LfD) via Behavior Cloning (BC) works well on multiple complex tasks. However, a limitation of the typical LfD approach is that it requires expert demonstrations for all scenarios, including those in which the algorithm is already well-trained. The recently proposed Learning from Interventions (LfI) overcomes this limitation by using an expert overseer. The expert overseer only intervenes when it suspects that an unsafe action is about to be taken. Although LfI significantly improves over LfD, the state-of-the-art LfI fails to account for delay caused by the expert's reaction time and only learns short-term behavior. We address these limitations by 1) interpolating the expert's interventions back in time, and 2) by splitting the policy into two hierarchical levels, one that generates sub-goals for the future and another that generates actions to reach those desired sub-goals. This sub-goal prediction forces the algorithm to learn long-term behavior while also being robust to the expert's reaction time. Our experiments show that LfI using sub-goals in a hierarchical policy framework trains faster and achieves better asymptotic performance than typical LfD.

5.7LGSep 25, 2018

Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals

Vikas Dhiman, Shurjo Banerjee, Jeffrey M. Siskind et al.

Consider mutli-goal tasks that involve static environments and dynamic goals. Examples of such tasks, such as goal-directed navigation and pick-and-place in robotics, abound. Two types of Reinforcement Learning (RL) algorithms are used for such tasks: model-free or model-based. Each of these approaches has limitations. Model-free RL struggles to transfer learned information when the goal location changes, but achieves high asymptotic accuracy in single goal tasks. Model-based RL can transfer learned information to new goal locations by retaining the explicitly learned state-dynamics, but is limited by the fact that small errors in modelling these dynamics accumulate over long-term planning. In this work, we improve upon the limitations of model-free RL in multi-goal domains. We do this by adapting the Floyd-Warshall algorithm for RL and call the adaptation Floyd-Warshall RL (FWRL). The proposed algorithm learns a goal-conditioned action-value function by constraining the value of the optimal path between any two states to be greater than or equal to the value of paths via intermediary states. Experimentally, we show that FWRL is more sample-efficient and learns higher reward strategies in multi-goal tasks as compared to Q-learning, model-based RL and other relevant baselines in a tabular domain.

7.9ROApr 12, 2016

Spatiotemporal Articulated Models for Dynamic SLAM

Suren Kumar, Vikas Dhiman, Madan Ravi Ganesh et al.

We propose an online spatiotemporal articulation model estimation framework that estimates both articulated structure as well as a temporal prediction model solely using passive observations. The resulting model can predict future mo- tions of an articulated object with high confidence because of the spatial and temporal structure. We demonstrate the effectiveness of the predictive model by incorporating it within a standard simultaneous localization and mapping (SLAM) pipeline for mapping and robot localization in previously unexplored dynamic environments. Our method is able to localize the robot and map a dynamic scene by explaining the observed motion in the world. We demonstrate the effectiveness of the proposed framework for both simulated and real-world dynamic environments.

8.6ROJan 16, 2013Code

Mutual Localization: Two Camera Relative 6-DOF Pose Estimation from Reciprocal Fiducial Observation

Vikas Dhiman, Julian Ryde, Jason J. Corso

Concurrently estimating the 6-DOF pose of multiple cameras or robots---cooperative localization---is a core problem in contemporary robotics. Current works focus on a set of mutually observable world landmarks and often require inbuilt egomotion estimates; situations in which both assumptions are violated often arise, for example, robots with erroneous low quality odometry and IMU exploring an unknown environment. In contrast to these existing works in cooperative localization, we propose a cooperative localization method, which we call mutual localization, that uses reciprocal observations of camera-fiducials to obviate the need for egomotion estimates and mutually observable world landmarks. We formulate and solve an algebraic formulation for the pose of the two camera mutual localization setup under these assumptions. Our experiments demonstrate the capabilities of our proposal egomotion-free cooperative localization method: for example, the method achieves 2cm range and 0.7 degree accuracy at 2m sensing for 6-DOF pose. To demonstrate the applicability of the proposed work, we deploy our method on Turtlebots and we compare our results with ARToolKit and Bundler, over which our method achieves a 10 fold improvement in translation estimation accuracy.