Rooholla Khorrambakht

h-index: 4

Papers: 6

Citations: 38

Novelty: 52%

AI Score: 43

Ranked #55,939 of 194,257 authors (top 29%); #12,732 in LG (top 32%)

6 Papers

12.3 · RO · Apr 21

Open-Architecture End-to-End System for Real-World Autonomous Robot Navigation

Venkata Naren Devarakonda, Ali Umut Kaypak, Raktim Gautam Goswami et al.

Enabling robots to autonomously navigate unknown, complex, and dynamic real-world environments presents several challenges, including imperfect perception, partial observability, localization uncertainty, and safety constraints. Current approaches are typically limited to simulations, where such challenges are not present. In this work, we present a lightweight, open-architecture, end-to-end system for real-world autonomous robot navigation. Specifically, we deploy a real-time navigation system on a quadruped robot by integrating multiple onboard components that communicate via ROS2. Given navigation tasks specified in natural language, the system fuses onboard sensory data for localization and mapping with open-vocabulary semantics to build hierarchical scene graphs from a continuously updated semantic object map. An LLM-based planner leverages these graphs to generate and adapt multi-step plans in real time as the scene evolves. Through experiments across multiple indoor environments using a Unitree Go2 quadruped, we demonstrate zero-shot real-world autonomous navigation, achieving over 88% task success, and provide analysis of system behavior during deployment.

2.2 · RO · Feb 5

Coupled Local and Global World Models for Efficient First Order RL

Joseph Amigo, Rooholla Khorrambakht, Nicolas Mansard et al.

World models offer a promising avenue for more faithfully capturing complex dynamics, including contacts and non-rigidity, as well as complex sensory information, such as visual perception, in situations where standard simulators struggle. However, these models are computationally expensive to evaluate, posing a challenge for popular RL approaches that have been used successfully with simulators to solve complex locomotion tasks yet still struggle with manipulation. This paper introduces a method that bypasses simulators entirely, training RL policies inside world models learned from robots' interactions with real environments. At its core, our approach enables policy training with large-scale diffusion models via a novel decoupled first-order gradient (FoG) method: a full-scale world model generates accurate forward trajectories, while a lightweight latent-space surrogate approximates its local dynamics for efficient gradient computation. This coupling of a local and global world model ensures high-fidelity unrolling alongside computationally tractable differentiation. We demonstrate the efficacy of our method on the Push-T manipulation task, where it significantly outperforms PPO in sample efficiency. We further evaluate our approach through an ego-centric object manipulation task with a quadruped. Together, these results demonstrate that learning inside data-driven world models is a promising pathway for solving hard-to-model RL tasks in image space without reliance on hand-crafted physics simulators.

1.6 · LG · Jul 1, 2021

A Consistency-Based Loss for Deep Odometry Through Uncertainty Propagation

Hamed Damirchi, Rooholla Khorrambakht, Hamid D. Taghirad et al.

The incremental poses computed through odometry can be integrated over time to calculate the pose of a device with respect to an initial location. The resulting global pose may be used to formulate a second, consistency-based loss term in a deep odometry setting. When multiple losses are imposed on a network, the uncertainty over each output can be derived to weight the different loss terms in a maximum likelihood setting. However, when imposing a constraint on the integrated transformation, only odometry is estimated at each iteration of the algorithm, so there is no information about the uncertainty associated with the global pose with which to weight the global loss term. In this paper, we associate uncertainties with the output poses of a deep odometry network and propagate the uncertainties through each iteration. Our goal is to use the estimated covariance matrix at each incremental step to weight the loss at the corresponding step while weighting the global loss term using the compounded uncertainty. This formulation provides an adaptive method to weight the incremental and integrated loss terms against each other, accounting for the increase in uncertainty as new estimates arrive. We provide quantitative and qualitative analysis of pose estimates and show that our method surpasses the accuracy of state-of-the-art visual odometry approaches. Then, uncertainty estimates are evaluated and comparisons against fixed baselines are provided. Finally, the uncertainty values are used in a realistic example to show the effectiveness of uncertainty quantification for localization.
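The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes translation-only increments with independent Gaussian errors, so the compounded covariance is simply the sum of the per-step covariances, whereas the paper propagates uncertainty over full poses. The function names `gaussian_nll` and `consistency_loss` are hypothetical.

```python
import numpy as np

def gaussian_nll(residual, cov):
    """Negative log-likelihood of a residual under a zero-mean Gaussian
    (the constant term is dropped)."""
    residual = np.asarray(residual, dtype=float)
    cov = np.asarray(cov, dtype=float)
    mahalanobis = residual @ np.linalg.solve(cov, residual)
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (mahalanobis + logdet)

def consistency_loss(pred_steps, true_steps, step_covs):
    """Per-step NLL terms plus one global NLL on the integrated translation.

    ASSUMPTION: translation-only increments with independent errors,
    so the covariance of the integrated estimate is the sum of step covariances.
    """
    pred_steps = np.asarray(pred_steps, dtype=float)
    true_steps = np.asarray(true_steps, dtype=float)
    step_covs = np.asarray(step_covs, dtype=float)

    # Local terms: each increment is weighted by its own covariance.
    local = sum(gaussian_nll(p - t, c)
                for p, t, c in zip(pred_steps, true_steps, step_covs))

    # Global term: integrated residual weighted by the compounded covariance.
    global_cov = step_covs.sum(axis=0)
    global_residual = pred_steps.sum(axis=0) - true_steps.sum(axis=0)
    return local + gaussian_nll(global_residual, global_cov), global_cov
```

Because the compounded covariance grows with every step, the global term is automatically down-weighted relative to the local terms as the trajectory gets longer, which is the adaptive behaviour the abstract describes.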

3.1 · LG · Jan 18, 2021

Deep Inertial Odometry with Accurate IMU Preintegration

Rooholla Khorrambakht, Chris Xiaoxuan Lu, Hamed Damirchi et al.

Inertial Measurement Units (IMUs) are interoceptive modalities that provide ego-motion measurements independent of environmental factors. They are widely adopted in various autonomous systems. Motivated by the limitations in processing the noisy measurements from these sensors using their mathematical models, researchers have recently proposed various deep learning architectures to estimate inertial odometry in an end-to-end manner. Nevertheless, the high-frequency and redundant measurements from IMUs lead to long raw sequences to be processed. In this study, we investigate the efficacy of accurate preintegration as a more realistic solution to the IMU motion model for deep inertial odometry (DIO); the resulting DIO is a fusion of model-driven and data-driven approaches. Accurate IMU preintegration has the potential to outperform the numerical approximation of the continuous IMU model used in existing DIOs. Experimental results validate the proposed DIO.
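For context, the sketch below shows the standard discrete (Euler-style) IMU preintegration, which compresses a long raw sequence into one relative rotation, velocity, and position increment. This is the kind of numerical approximation the abstract contrasts with accurate preintegration, not the paper's method. It assumes bias-free measurements and ignores gravity and noise propagation; `so3_exp` and `preintegrate` are illustrative names.

```python
import numpy as np

def so3_exp(phi):
    """Rodrigues' formula: rotation vector -> 3x3 rotation matrix."""
    phi = np.asarray(phi, dtype=float)
    angle = np.linalg.norm(phi)
    K = np.array([[0.0, -phi[2], phi[1]],
                  [phi[2], 0.0, -phi[0]],
                  [-phi[1], phi[0], 0.0]])
    if angle < 1e-9:
        return np.eye(3) + K  # first-order approximation for tiny rotations
    return (np.eye(3)
            + np.sin(angle) / angle * K
            + (1.0 - np.cos(angle)) / angle**2 * (K @ K))

def preintegrate(gyro, accel, dt):
    """Accumulate relative motion between two keyframes.

    ASSUMPTIONS: measurements are bias-free, gravity is ignored, and each
    sample is held constant over its interval dt (Euler discretisation).
    Returns (delta_R, delta_v, delta_p) expressed in the first frame.
    """
    gyro = np.asarray(gyro, dtype=float)
    accel = np.asarray(accel, dtype=float)
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        a_rot = dR @ a                       # acceleration in the first frame
        dp = dp + dv * dt + 0.5 * a_rot * dt**2
        dv = dv + a_rot * dt
        dR = dR @ so3_exp(w * dt)            # compose the incremental rotation
    return dR, dv, dp
```

A network can then consume one preintegrated triple per window instead of every raw sample, which is how preintegration shortens the input sequence.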

4.2 · CV · Nov 17, 2020

Exploring Self-Attention for Visual Odometry

Hamed Damirchi, Rooholla Khorrambakht, Hamid D. Taghirad

Visual odometry networks commonly use pretrained optical flow networks to derive the ego-motion between consecutive frames. The features extracted by these networks represent the motion of all the pixels between frames. However, because of dynamic objects and texture-less surfaces in the scene, the motion information in some image regions may not be reliable for inferring odometry, since the motion of dynamic objects does not reflect the incremental change in camera position. Recent works in this area lack attention mechanisms in their structures to facilitate dynamic reweighting of the feature maps for extracting more refined ego-motion information. In this paper, we explore the effectiveness of self-attention in visual odometry. We report qualitative and quantitative results against the SOTA methods. Furthermore, saliency-based studies alongside specially designed experiments are used to investigate the effect of self-attention on VO. Our experiments show that using self-attention allows for the extraction of better features while achieving better odometry performance than networks that lack such structures.
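The reweighting mechanism mentioned above can be sketched as generic scaled dot-product self-attention over a flattened feature map. This is an illustrative sketch only: the abstract does not specify the attention variant the paper uses, and the projection matrices `w_q`, `w_k`, `w_v` here are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(features, w_q, w_k, w_v):
    """Scaled dot-product self-attention.

    features: (N, C) array, one row per spatial location of a flattened
              feature map (ASSUMPTION: the map has already been flattened).
    w_q, w_k, w_v: (C, D) projection matrices (hypothetical, normally learned).
    Returns the reweighted features (N, D) and the attention weights (N, N).
    """
    q, k, v = features @ w_q, features @ w_k, features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # pairwise location similarity
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights
```

Each output row is a weighted mix of all locations, so regions whose motion cues are unreliable can receive low weight when the projections are trained end to end.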

3.3 · LG · Jul 6, 2020

ARC-Net: Activity Recognition Through Capsules

Hamed Damirchi, Rooholla Khorrambakht, Hamid Taghirad

Human Activity Recognition (HAR) is a challenging problem that needs more advanced solutions than handcrafted features to achieve desirable performance. Deep learning has been proposed as a way to obtain more accurate HAR systems that are robust against noise. In this paper, we introduce ARC-Net and propose the use of capsules to fuse the information from multiple inertial measurement units (IMUs) to predict the activity performed by the subject. We hypothesize that this network will be able to tune out unnecessary information and make more accurate decisions through the iterative mechanism embedded in capsule networks. We provide heatmaps of the priors learned by the network to visualize how the trained network uses each of the data sources. By using the proposed network, we were able to improve on the accuracy of state-of-the-art approaches by 2%. Furthermore, we investigate the directionality of the confusion matrices of our results and discuss the specificity of the activities based on the provided data.