Ahmed H. Qureshi

RO
h-index18
35papers
1,488citations
Novelty58%
AI Score60

35 Papers

ROMay 28
Physics-informed Goal-Conditioned Reinforcement Learning under Hybrid Contact Dynamics

Vittorio Giammarino, Anastasios Manganaris, Ahmed H. Qureshi

Learning to reach arbitrary goals from sparse feedback requires agents to infer a rich notion of reachability across state--goal pairs. Goal-conditioned reinforcement learning (GCRL) tackles this challenge by learning policies that generalize across goals, but this generalization becomes increasingly difficult as the underlying dynamics become high-dimensional, hybrid, or contact-dependent. To address this issue, physics-informed GCRL (Pi-GCRL) introduces optimal-control-inspired inductive biases into goal-conditioned value learning. While Pi-GCRL methods have proven effective in navigation and object-free goal-reaching domains, their reliability in contact-rich tasks remains unclear, where contact interactions induce hybrid dynamics, mode-dependent controllability, and nonsmooth value landscapes. In this work, we show that these structural properties can cause existing Pi-GCRL methods to degrade when applied naively to contact-rich manipulation. Motivated by this analysis, we introduce contact-aware and hierarchical formulations that apply physics-informed inductive biases selectively across the manipulation problem. Our results provide a principled step toward extending Pi-GCRL to contact-rich manipulation.

ROJun 1, 2023Code
Progressive Learning for Physics-informed Neural Motion Planning

Ruiqi Ni, Ahmed H. Qureshi

Motion planning (MP) is one of the core robotics problems requiring fast methods for finding a collision-free robot motion path connecting the given start and goal states. Neural motion planners (NMPs) demonstrate fast computational speed in finding path solutions but require a huge amount of expert trajectories for learning, thus adding a significant training computational load. In contrast, recent advancements have also led to a physics-informed NMP approach that directly solves the Eikonal equation for motion planning and does not require expert demonstrations for learning. However, experiments show that the physics-informed NMP approach performs poorly in complex environments and lacks scalability in multiple scenarios and high-dimensional real robot settings. To overcome these limitations, this paper presents a novel and tractable Eikonal equation formulation and introduces a new progressive learning strategy to train neural networks without expert data in complex, cluttered, multiple high-dimensional robot motion planning scenarios. The results demonstrate that our method outperforms state-of-the-art traditional MP, data-driven NMP, and physics-informed NMP methods by a significant margin in terms of computational planning speed, path quality, and success rates. We also show that our approach scales to multiple complex, cluttered scenarios and the real robot set up in a narrow passage environment. The proposed method's videos and code implementations are available at https://github.com/ruiqini/P-NTFields.

LGMar 14, 2023Code
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies

Daniel Lawson, Ahmed H. Qureshi

Recent work has shown the promise of creating generalist, transformer-based, models for language, vision, and sequential decision-making problems. To create such models, we generally require centralized training objectives, data, and compute. It is of interest if we can more flexibly create generalist policies by merging together multiple, task-specific, individually trained policies. In this work, we take a preliminary step in this direction through merging, or averaging, subsets of Decision Transformers in parameter space trained on different MuJoCo locomotion problems, forming multi-task models without centralized training. We also demonstrate the importance of various methodological choices when merging policies, such as utilizing common pre-trained initializations, increasing model capacity, and utilizing Fisher information for weighting parameter importance. In general, we believe research in this direction could help democratize and distribute the process that forms multi-task robotics policies. Our implementation is available at https://github.com/daniellawson9999/merging-decision-transformers.

AIMay 21, 2022
Co-design of Embodied Neural Intelligence via Constrained Evolution

Zhiquan Wang, Bedrich Benes, Ahmed H. Qureshi et al.

We introduce a novel co-design method for autonomous moving agents' shape attributes and locomotion by combining deep reinforcement learning and evolution with user control. Our main inspiration comes from evolution, which has led to wide variability and adaptation in Nature and has the potential to significantly improve design and behavior simultaneously. Our method takes an input agent with optional simple constraints such as leg parts that should not evolve or allowed ranges of changes. It uses physics-based simulation to determine its locomotion and finds a behavior policy for the input design, later used as a baseline for comparison. The agent is then randomly modified within the allowed ranges creating a new generation of several hundred agents. The generation is trained by transferring the previous policy, which significantly speeds up the training. The best-performing agents are selected, and a new generation is formed using their crossover and mutations. The next generations are then trained until satisfactory results are reached. We show a wide variety of evolved agents, and our results show that even with only 10% of changes, the overall performance of the evolved agents improves 50%. If more significant changes to the initial design are allowed, our experiments' performance improves even more to 150%. Contrary to related work, our co-design works on a single GPU and provides satisfactory results by training thousands of agents within one hour.

ROSep 5, 2023
Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning

Manav Kulshrestha, Ahmed H. Qureshi

Robotic manipulation tasks, such as object rearrangement, play a crucial role in enabling robots to interact with complex and arbitrary environments. Existing work focuses primarily on single-level rearrangement planning and, even if multiple levels exist, dependency relations among substructures are geometrically simpler, like tower stacking. We propose Structural Concept Learning (SCL), a deep learning approach that leverages graph attention networks to perform multi-level object rearrangement planning for scenes with structural dependency hierarchies. It is trained on a self-generated simulation data set with intuitive structures, works for unseen scenes with an arbitrary number of objects and higher complexity of structures, infers independent substructures to allow for task parallelization over multiple manipulators, and generalizes to the real world. We compare our method with a range of classical and model-based baselines to show that our method leverages its scene understanding to achieve better performance, flexibility, and efficiency. The dataset, supplementary details, videos, and code implementation are available at: https://manavkulshrestha.github.io/scl

ROSep 30, 2022
NTFields: Neural Time Fields for Physics-Informed Robot Motion Planning

Ruiqi Ni, Ahmed H. Qureshi

Neural Motion Planners (NMPs) have emerged as a promising tool for solving robot navigation tasks in complex environments. However, these methods often require expert data for learning, which limits their application to scenarios where data generation is time-consuming. Recent developments have also led to physics-informed deep neural models capable of representing complex dynamical Partial Differential Equations (PDEs). Inspired by these developments, we propose Neural Time Fields (NTFields) for robot motion planning in cluttered scenarios. Our framework represents a wave propagation model generating continuous arrival time to find path solutions informed by a nonlinear first-order PDE called Eikonal Equation. We evaluate our method in various cluttered 3D environments, including the Gibson dataset, and demonstrate its ability to solve motion planning problems for 4-DOF and 6-DOF robot manipulators where the traditional grid-based Eikonal planners often face the curse of dimensionality. Furthermore, the results show that our method exhibits high success rates and significantly lower computational times than the state-of-the-art methods, including NMPs that require training data from classical planners.

ROAug 23, 2022
Robot Active Neural Sensing and Planning in Unknown Cluttered Environments

Hanwen Ren, Ahmed H. Qureshi

Active sensing and planning in unknown, cluttered environments is an open challenge for robots intending to provide home service, search and rescue, narrow-passage inspection, and medical assistance. Although many active sensing methods exist, they often consider open spaces, assume known settings, or mostly do not generalize to real-world scenarios. We present the active neural sensing approach that generates the kinematically feasible viewpoint sequences for the robot manipulator with an in-hand camera to gather the minimum number of observations needed to reconstruct the underlying environment. Our framework actively collects the visual RGBD observations, aggregates them into scene representation, and performs object shape inference to avoid unnecessary robot interactions with the environment. We train our approach on synthetic data with domain randomization and demonstrate its successful execution via sim-to-real transfer in reconstructing narrow, covered, real-world cabinet environments cluttered with unknown objects. The natural cabinet scenarios impose significant challenges for robot motion and scene reconstruction due to surrounding obstacles and low ambient lighting conditions. However, despite unfavorable settings, our method exhibits high performance compared to its baselines in terms of various environment reconstruction metrics, including planning speed, the number of viewpoints, and overall scene coverage.

ROMar 2, 2022
Model-free Neural Lyapunov Control for Safe Robot Navigation

Zikang Xiong, Joe Eappen, Ahmed H. Qureshi et al.

Model-free Deep Reinforcement Learning (DRL) controllers have demonstrated promising results on various challenging non-linear control tasks. While a model-free DRL algorithm can solve unknown dynamics and high-dimensional problems, it lacks safety assurance. Although safety constraints can be encoded as part of a reward function, there still exists a large gap between an RL controller trained with this modified reward and a safe controller. In contrast, instead of implicitly encoding safety constraints with rewards, we explicitly co-learn a Twin Neural Lyapunov Function (TNLF) with the control policy in the DRL training loop and use the learned TNLF to build a runtime monitor. Combined with the path generated from a planner, the monitor chooses appropriate waypoints that guide the learned controller to provide collision-free control trajectories. Our approach inherits the scalability advantages from DRL while enhancing safety guarantees. Our experimental evaluation demonstrates the effectiveness of our approach compared to DRL with augmented rewards and constrained DRL methods over a range of high-dimensional safety-sensitive navigation tasks.

RONov 11, 2022
Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling

Daniel Lawson, Ahmed H. Qureshi

Learning long-horizon tasks such as navigation has presented difficult challenges for successfully applying reinforcement learning to robotics. From another perspective, under known environments, sampling-based planning can robustly find collision-free paths in environments without learning. In this work, we propose Control Transformer that models return-conditioned sequences from low-level policies guided by a sampling-based Probabilistic Roadmap (PRM) planner. We demonstrate that our framework can solve long-horizon navigation tasks using only local information. We evaluate our approach on partially-observed maze navigation with MuJoCo robots, including Ant, Point, and Humanoid. We show that Control Transformer can successfully navigate through mazes and transfer to unknown environments. Additionally, we apply our method to a differential drive robot (Turtlebot3) and show zero-shot sim2real transfer under noisy observations.

ROOct 6, 2022
CoGrasp: 6-DoF Grasp Generation for Human-Robot Collaboration

Abhinav K. Keshari, Hanwen Ren, Ahmed H. Qureshi

Robot grasping is an actively studied area in robotics, mainly focusing on the quality of generated grasps for object manipulation. However, despite advancements, these methods do not consider the human-robot collaboration settings where robots and humans will have to grasp the same objects concurrently. Therefore, generating robot grasps compatible with human preferences of simultaneously holding an object becomes necessary to ensure a safe and natural collaboration experience. In this paper, we propose a novel, deep neural network-based method called CoGrasp that generates human-aware robot grasps by contextualizing human preference models of object grasping into the robot grasp selection process. We validate our approach against existing state-of-the-art robot grasping methods through simulated and real-robot experiments and user studies. In real robot experiments, our method achieves about 88\% success rate in producing stable grasps that also allow humans to interact and grasp objects simultaneously in a socially compliant manner. Furthermore, our user study with 10 independent participants indicated our approach enables a safe, natural, and socially-aware human-robot objects' co-grasping experience compared to a standard robot grasping technique.

CVSep 9, 2023
AnyPose: Anytime 3D Human Pose Forecasting via Neural Ordinary Differential Equations

Zixing Wang, Ahmed H. Qureshi

Anytime 3D human pose forecasting is crucial to synchronous real-world human-machine interaction, where the term ``anytime" corresponds to predicting human pose at any real-valued time step. However, to the best of our knowledge, all the existing methods in human pose forecasting perform predictions at preset, discrete time intervals. Therefore, we introduce AnyPose, a lightweight continuous-time neural architecture that models human behavior dynamics with neural ordinary differential equations. We validate our framework on the Human3.6M, AMASS, and 3DPW dataset and conduct a series of comprehensive analyses towards comparison with existing methods and the intersection of human pose and neural ordinary differential equations. Our results demonstrate that AnyPose exhibits high-performance accuracy in predicting future poses and takes significantly lower computational time than traditional methods in solving anytime prediction tasks.

CVJun 26, 2023
SIMMF: Semantics-aware Interactive Multiagent Motion Forecasting for Autonomous Vehicle Driving

Vidyaa Krishnan Nivash, Ahmed H. Qureshi

Autonomous vehicles require motion forecasting of their surrounding multiagents (pedestrians and vehicles) to make optimal decisions for navigation. The existing methods focus on techniques to utilize the positions and velocities of these agents and fail to capture semantic information from the scene. Moreover, to mitigate the increase in computational complexity associated with the number of agents in the scene, some works leverage Euclidean distance to prune far-away agents. However, distance-based metric alone is insufficient to select relevant agents and accurately perform their predictions. To resolve these issues, we propose the Semantics-aware Interactive Multiagent Motion Forecasting (SIMMF) method to capture semantics along with spatial information and optimally select relevant agents for motion prediction. Specifically, we achieve this by implementing a semantic-aware selection of relevant agents from the scene and passing them through an attention mechanism to extract global encodings. These encodings along with agents' local information, are passed through an encoder to obtain time-dependent latent variables for a motion policy predicting the future trajectories. Our results show that the proposed approach outperforms state-of-the-art baselines and provides more accurate and scene-consistent predictions.

ROMar 2, 2023
Co-learning Planning and Control Policies Constrained by Differentiable Logic Specifications

Zikang Xiong, Daniel Lawson, Joe Eappen et al.

Synthesizing planning and control policies in robotics is a fundamental task, further complicated by factors such as complex logic specifications and high-dimensional robot dynamics. This paper presents a novel reinforcement learning approach to solving high-dimensional robot navigation tasks with complex logic specifications by co-learning planning and control policies. Notably, this approach significantly reduces the sample complexity in training, allowing us to train high-quality policies with much fewer samples compared to existing reinforcement learning algorithms. In addition, our methodology streamlines complex specification extraction from map images and enables the efficient generation of long-horizon robot motion paths across different map layouts. Moreover, our approach also demonstrates capabilities for high-dimensional control and avoiding suboptimal policies via policy alignment. The efficacy of our approach is demonstrated through experiments involving simulated high-dimensional quadruped robot dynamics and a real-world differential drive robot (TurtleBot3) under different types of task specifications.

ROApr 14
Weakly-supervised Learning for Physics-informed Neural Motion Planning via Sparse Roadmap

Ruiqi Ni, Yuchen Liu, Ahmed H. Qureshi

The motion planning problem requires finding a collision-free path between start and goal configurations in high-dimensional, cluttered spaces. Recent learning-based methods offer promising solutions, with self-supervised physics-informed approaches such as Neural Time Fields (NTFields) solving the Eikonal equation to learn value functions without expert demonstrations. However, existing physics-informed methods struggle to scale in complex, multi-room environments, where simply increasing the number of samples cannot resolve local minima or guarantee global consistency. We propose Hierarchical Neural Time Fields (H-NTFields), a weakly-supervised framework that combines weak supervision from sparse roadmaps with physics-informed PDE regularization. The roadmap provides global topological anchors through upper and lower bounds on travel times, while PDE losses enforce local geometric fidelity and obstacle-aware propagation. Experiments on 18 Gibson environments and real robotic platforms show that H-NTFields substantially improves robustness over prior physics-informed methods, while enabling fast amortized inference through a continuous value representation.

ROJun 10, 2023
MANER: Multi-Agent Neural Rearrangement Planning of Objects in Cluttered Environments

Vivek Gupta, Praphpreet Dhir, Jeegn Dani et al.

Object rearrangement is a fundamental problem in robotics with various practical applications ranging from managing warehouses to cleaning and organizing home kitchens. While existing research has primarily focused on single-agent solutions, real-world scenarios often require multiple robots to work together on rearrangement tasks. This paper proposes a comprehensive learning-based framework for multi-agent object rearrangement planning, addressing the challenges of task sequencing and path planning in complex environments. The proposed method iteratively selects objects, determines their relocation regions, and pairs them with available robots under kinematic feasibility and task reachability for execution to achieve the target arrangement. Our experiments on a diverse range of simulated and real-world environments demonstrate the effectiveness and robustness of the proposed framework. Furthermore, results indicate improved performance in terms of traversal time and success rate compared to baseline approaches.

ROMar 11
PPGuide: Steering Diffusion Policies with Performance Predictive Guidance

Zixing Wang, Devesh K. Jha, Ahmed H. Qureshi et al.

Diffusion policies have shown to be very efficient at learning complex, multi-modal behaviors for robotic manipulation. However, errors in generated action sequences can compound over time which can potentially lead to failure. Some approaches mitigate this by augmenting datasets with expert demonstrations or learning predictive world models which might be computationally expensive. We introduce Performance Predictive Guidance (PPGuide), a lightweight, classifier-based framework that steers a pre-trained diffusion policy away from failure modes at inference time. PPGuide makes use of a novel self-supervised process: it uses attention-based multiple instance learning to automatically estimate which observation-action chunks from the policy's rollouts are relevant to success or failure. We then train a performance predictor on this self-labeled data. During inference, this predictor provides a real-time gradient to guide the policy toward more robust actions. We validated our proposed PPGuide across a diverse set of tasks from the Robomimic and MimicGen benchmarks, demonstrating consistent improvements in performance.

ROMar 19
Graph-of-Constraints Model Predictive Control for Reactive Multi-agent Task and Motion Planning

Anastasios Manganaris, Jeremy Lu, Ahmed H. Qureshi et al.

Sequences of interdependent geometric constraints are central to many multi-agent Task and Motion Planning (TAMP) problems. However, existing methods for handling such constraint sequences struggle with partially ordered tasks and dynamic agent assignments. They typically assume static assignments and cannot adapt when disturbances alter task allocations. To overcome these limitations, we introduce Graph-of-Constraints Model Predictive Control (GoC-MPC), a generalized sequence-of-constraints framework integrated with MPC. GoC-MPC naturally supports partially ordered tasks, dynamic agent coordination, and disturbance recovery. By defining constraints over tracked 3D keypoints, our method robustly solves diverse multi-agent manipulation tasks-coordinating agents and adapting online from visual observations alone, without relying on training data or environment models. Experiments demonstrate that GoC-MPC achieves higher success rates, significantly faster TAMP computation, and shorter overall paths compared to recent baselines, establishing it as an efficient and robust solution for multi-agent manipulation under real-world disturbances. Our supplementary video and code can be found at https://sites.google.com/view/goc-mpc/home .

LGDec 12, 2025
Goal Reaching with Eikonal-Constrained Hierarchical Quasimetric Reinforcement Learning

Vittorio Giammarino, Ahmed H. Qureshi

Goal-Conditioned Reinforcement Learning (GCRL) mitigates the difficulty of reward design by framing tasks as goal reaching rather than maximizing hand-crafted reward signals. In this setting, the optimal goal-conditioned value function naturally forms a quasimetric, motivating Quasimetric RL (QRL), which constrains value learning to quasimetric mappings and enforces local consistency through discrete, trajectory-based constraints. We propose Eikonal-Constrained Quasimetric RL (Eik-QRL), a continuous-time reformulation of QRL based on the Eikonal Partial Differential Equation (PDE). This PDE-based structure makes Eik-QRL trajectory-free, requiring only sampled states and goals, while improving out-of-distribution generalization. We provide theoretical guarantees for Eik-QRL and identify limitations that arise under complex dynamics. To address these challenges, we introduce Eik-Hierarchical QRL (Eik-HiQRL), which integrates Eik-QRL into a hierarchical decomposition. Empirically, Eik-HiQRL achieves state-of-the-art performance in offline goal-conditioned navigation and yields consistent gains over QRL in manipulation tasks, matching temporal-difference methods.

ROMar 9, 2024
Physics-informed Neural Motion Planning on Constraint Manifolds

Ruiqi Ni, Ahmed H. Qureshi

Constrained Motion Planning (CMP) aims to find a collision-free path between the given start and goal configurations on the kinematic constraint manifolds. These problems appear in various scenarios ranging from object manipulation to legged-robot locomotion. However, the zero-volume nature of manifolds makes the CMP problem challenging, and the state-of-the-art methods still take several seconds to find a path and require a computationally expansive path dataset for imitation learning. Recently, physics-informed motion planning methods have emerged that directly solve the Eikonal equation through neural networks for motion planning and do not require expert demonstrations for learning. Inspired by these approaches, we propose the first physics-informed CMP framework that solves the Eikonal equation on the constraint manifolds and trains neural function for CMP without expert data. Our results show that the proposed approach efficiently solves various CMP problems in both simulation and real-world, including object manipulation under orientation constraints and door opening with a high-dimensional 6-DOF robot manipulator. In these complex settings, our method exhibits high success rates and finds paths in sub-seconds, which is many times faster than the state-of-the-art CMP methods.

ROOct 23, 2025
Robust Point Cloud Reinforcement Learning via PCA-Based Canonicalization

Michael Bezick, Vittorio Giammarino, Ahmed H. Qureshi

Reinforcement Learning (RL) from raw visual input has achieved impressive successes in recent years, yet it remains fragile to out-of-distribution variations such as changes in lighting, color, and viewpoint. Point Cloud Reinforcement Learning (PC-RL) offers a promising alternative by mitigating appearance-based brittleness, but its sensitivity to camera pose mismatches continues to undermine reliability in realistic settings. To address this challenge, we propose PCA Point Cloud (PPC), a canonicalization framework specifically tailored for downstream robotic control. PPC maps point clouds under arbitrary rigid-body transformations to a unique canonical pose, aligning observations to a consistent frame, thereby substantially decreasing viewpoint-induced inconsistencies. In our experiments, we show that PPC improves robustness to unseen camera poses across challenging robotic tasks, providing a principled alternative to domain randomization.

LGSep 11, 2025
Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning

Xuefeng Wang, Lei Zhang, Henglin Pu et al.

Existing reinforcement learning (RL) methods struggle with complex dynamical systems that demand interactions at high frequencies or irregular time intervals. Continuous-time RL (CTRL) has emerged as a promising alternative by replacing discrete-time Bellman recursion with differential value functions defined as viscosity solutions of the Hamilton--Jacobi--Bellman (HJB) equation. While CTRL has shown promise, its applications have been largely limited to the single-agent domain. This limitation stems from two key challenges: (i) conventional solution methods for HJB equations suffer from the curse of dimensionality (CoD), making them intractable in high-dimensional systems; and (ii) even with HJB-based learning approaches, accurately approximating centralized value functions in multi-agent settings remains difficult, which in turn destabilizes policy training. In this paper, we propose a CT-MARL framework that uses physics-informed neural networks (PINNs) to approximate HJB-based value functions at scale. To ensure the value is consistent with its differential structure, we align value learning with value-gradient learning by introducing a Value Gradient Iteration (VGI) module that iteratively refines value gradients along trajectories. This improves gradient fidelity, in turn yielding more accurate values and stronger policy learning. We evaluate our method using continuous-time variants of standard benchmarks, including multi-agent particle environment (MPE) and multi-agent MuJoCo. Our results demonstrate that our approach consistently outperforms existing continuous-time RL baselines and scales to complex multi-agent dynamics.

LGSep 8, 2025
Physics-informed Value Learner for Offline Goal-Conditioned Reinforcement Learning

Vittorio Giammarino, Ruiqi Ni, Ahmed H. Qureshi

Offline Goal-Conditioned Reinforcement Learning (GCRL) holds great promise for domains such as autonomous navigation and locomotion, where collecting interactive data is costly and unsafe. However, it remains challenging in practice due to the need to learn from datasets with limited coverage of the state-action space and to generalize across long-horizon tasks. To improve on these challenges, we propose a \emph{Physics-informed (Pi)} regularized loss for value learning, derived from the Eikonal Partial Differential Equation (PDE) and which induces a geometric inductive bias in the learned value function. Unlike generic gradient penalties that are primarily used to stabilize training, our formulation is grounded in continuous-time optimal control and encourages value functions to align with cost-to-go structures. The proposed regularizer is broadly compatible with temporal-difference-based value learning and can be integrated into existing Offline GCRL algorithms. When combined with Hierarchical Implicit Q-Learning (HIQL), the resulting method, Eikonal-regularized HIQL (Eik-HIQL), yields significant improvements in both performance and generalization, with pronounced gains in stitching regimes and large-scale navigation tasks.

ROJun 5, 2021
Motion Planning Transformers: A Motion Planning Framework for Mobile Robots

Jacob J. Johnson, Uday S. Kalra, Ankit Bhatia et al.

Fast and efficient sampling-based motion planning (SMP) is an integral component of many robotic systems, such as autonomous cars. A popular technique to improve the efficiency of these planners is to restrict search space in the planning domain. Existing algorithms define parametric functions to bound the search space, but these do not extend to non-holonomic robotic systems. Recent learning-based methods use a combination of convolutional and fully connected networks to encode the planning space. However, these methods are restricted to fixed map sizes, which are often not realistic in the real world. In this paper, we introduce a transformer-based approach, Motion Planning Transformer, to restrict the search space by learning to discern regions with a valid path from prior data. The model learns not only to restrict search spaces for simple 2D systems but also for non-holonomic robotic systems. We validate our method on various randomly generated environments with different map sizes and plan trajectories for a physical non-holonomic robot. We also provide a ROS2 plugin of our method for the Nav2 planning stack. The results show that our method reduces search space nodes by 2-12 times compared to traditional planners and has better generalizability than recent learning-based planners.

ROJun 2, 2021
NeRP: Neural Rearrangement Planning for Unknown Objects

Ahmed H. Qureshi, Arsalan Mousavian, Chris Paxton et al.

Robots will be expected to manipulate a wide variety of objects in complex and arbitrary ways as they become more widely used in human environments. As such, the rearrangement of objects has been noted to be an important benchmark for AI capabilities in recent years. We propose NeRP (Neural Rearrangement Planning), a deep learning based approach for multi-step neural object rearrangement planning which works with never-before-seen objects, that is trained on simulation data, and generalizes to the real world. We compare NeRP to several naive and model-based baselines, demonstrating that our approach is measurably better and can efficiently arrange unseen objects in fewer steps and with less planning time. Finally, we demonstrate it on several challenging rearrangement problems in the real world.

ROJan 17, 2021
MPC-MPNet: Model-Predictive Motion Planning Networks for Fast, Near-Optimal Planning under Kinodynamic Constraints

Linjun Li, Yinglong Miao, Ahmed H. Qureshi et al.

Kinodynamic Motion Planning (KMP) is to find a robot motion subject to concurrent kinematics and dynamics constraints. To date, quite a few methods solve KMP problems and those that exist struggle to find near-optimal solutions and exhibit high computational complexity as the planning space dimensionality increases. To address these challenges, we present a scalable, imitation learning-based, Model-Predictive Motion Planning Networks framework that quickly finds near-optimal path solutions with worst-case theoretical guarantees under kinodynamic constraints for practical underactuated systems. Our framework introduces two algorithms built on a neural generator, discriminator, and a parallelizable Model Predictive Controller (MPC). The generator outputs various informed states towards the given target, and the discriminator selects the best possible subset from them for the extension. The MPC locally connects the selected informed states while satisfying the given constraints leading to feasible, near-optimal solutions. We evaluate our algorithms on a range of cluttered, kinodynamically constrained, and underactuated planning problems with results indicating significant improvements in computation times, path qualities, and success rates over existing methods.

ROOct 17, 2020
Constrained Motion Planning Networks X

Ahmed H. Qureshi, Jiangeng Dong, Asfiya Baig et al.

Constrained motion planning is a challenging field of research, aiming for computationally efficient methods that can find a collision-free path on the constraint manifolds between a given start and goal configuration. These planning problems come up surprisingly frequently, such as in robot manipulation for performing daily life assistive tasks. However, few solutions to constrained motion planning are available, and those that exist struggle with high computational time complexity in finding a path solution on the manifolds. To address this challenge, we present Constrained Motion Planning Networks X (CoMPNetX). It is a neural planning approach, comprising a conditional deep neural generator and discriminator with neural gradients-based fast projection operator. We also introduce neural task and scene representations conditioned on which the CoMPNetX generates implicit manifold configurations to turbo-charge any underlying classical planner such as Sampling-based Motion Planning methods for quickly solving complex constrained planning tasks. We show that our method finds path solutions with high success rates and lower computation times than state-of-the-art traditional path-finding tools on various challenging scenarios.

ROAug 12, 2020
Dynamically Constrained Motion Planning Networks for Non-Holonomic Robots

Jacob J. Johnson, Linjun Li, Fei Liu et al.

Reliable real-time planning for robots is essential in today's rapidly expanding automated ecosystem. In such environments, traditional methods that plan by relaxing constraints become unreliable or slow-down for kinematically constrained robots. This paper describes the algorithm Dynamic Motion Planning Networks (Dynamic MPNet), an extension to Motion Planning Networks, for non-holonomic robots that address the challenge of real-time motion planning using a neural planning approach. We propose modifications to the training and planning networks that make it possible for real-time planning while improving the data efficiency of training and trained models' generalizability. We evaluate our model in simulation for planning tasks for a non-holonomic robot. We also demonstrate experimental results for an indoor navigation task using a Dubins car.

ROAug 9, 2020
Neural Manipulation Planning on Constraint Manifolds

Ahmed H. Qureshi, Jiangeng Dong, Austin Choe et al.

The presence of task constraints imposes a significant challenge to motion planning. Despite all recent advancements, existing algorithms are still computationally expensive for most planning problems. In this paper, we present Constrained Motion Planning Networks (CoMPNet), the first neural planner for multimodal kinematic constraints. Our approach comprises the following components: i) constraint and environment perception encoders; ii) neural robot configuration generator that outputs configurations on/near the constraint manifold(s), and iii) a bidirectional planning algorithm that takes the generated configurations to create a feasible robot motion trajectory. We show that CoMPNet solves practical motion planning tasks involving both unconstrained and constrained problems. Furthermore, it generalizes to new unseen locations of the objects, i.e., not seen during training, in the given environments with high success rates. When compared to the state-of-the-art constrained motion planning algorithms, CoMPNet outperforms by order of magnitude improvement in computational speed with a significantly lower variance.

ROJul 13, 2019
Motion Planning Networks: Bridging the Gap Between Learning-based and Classical Motion Planners

Ahmed H. Qureshi, Yinglong Miao, Anthony Simeonov et al.

This paper describes Motion Planning Networks (MPNet), a computationally efficient, learning-based neural planner for solving motion planning problems. MPNet uses neural networks to learn general near-optimal heuristics for path planning in seen and unseen environments. It takes environment information such as raw point-cloud from depth sensors, as well as a robot's initial and desired goal configurations and recursively calls itself to bidirectionally generate connectable paths. In addition to finding directly connectable and near-optimal paths in a single pass, we show that worst-case theoretical guarantees can be proven if we merge this neural network strategy with classical sample-based planners in a hybrid approach while still retaining significant computational and optimality improvements. To train the MPNet models, we present an active continual learning approach that enables MPNet to learn from streaming data and actively ask for expert demonstrations when needed, drastically reducing data for training. We validate MPNet against gold-standard and state-of-the-art planning methods in a variety of problems from 2D to 7D robot configuration spaces in challenging and cluttered environments, with results showing significant and consistently stronger performance metrics, and motivating neural planning in general as a modern strategy for solving motion planning problems efficiently.

LGMay 25, 2019
Composing Task-Agnostic Policies with Deep Reinforcement Learning

Ahmed H. Qureshi, Jacob J. Johnson, Yuzhe Qin et al.

The composition of elementary behaviors to solve challenging transfer learning problems is one of the key elements in building intelligent machines. To date, there has been plenty of work on learning task-specific policies or skills but almost no focus on composing necessary, task-agnostic skills to find a solution to new problems. In this paper, we propose a novel deep reinforcement learning-based skill transfer and composition method that takes the agent's primitive policies to solve unseen tasks. We evaluate our method in difficult cases where training policy through standard reinforcement learning (RL) or even hierarchical RL is either not feasible or exhibits high sample complexity. We show that our method not only transfers skills to new problem settings but also solves the challenging environments requiring both task planning and motion control with high data efficiency.

ROApr 25, 2019
Neural Path Planning: Fixed Time, Near-Optimal Path Generation via Oracle Imitation

Mayur J. Bency, Ahmed H. Qureshi, Michael C. Yip

Fast and efficient path generation is critical for robots operating in complex environments. This motion planning problem is often performed in a robot's actuation or configuration space, where popular pathfinding methods such as A*, RRT*, get exponentially more computationally expensive to execute as the dimensionality increases or the spaces become more cluttered and complex. On the other hand, if one were to save the entire set of paths connecting all pair of locations in the configuration space a priori, one would run out of memory very quickly. In this work, we introduce a novel way of producing fast and optimal motion plans for static environments by using a stepping neural network approach, called OracleNet. OracleNet uses Recurrent Neural Networks to determine end-to-end trajectories in an iterative manner that implicitly generates optimal motion plans with minimal loss in performance in a compact form. The algorithm is straightforward in implementation while consistently generating near-optimal paths in a single, iterative, end-to-end roll-out. In practice, OracleNet generally has fixed-time execution regardless of the configuration space complexity while outperforming popular pathfinding algorithms in complex environments and higher dimensions

ROSep 26, 2018
Deeply Informed Neural Sampling for Robot Motion Planning

Ahmed H. Qureshi, Michael C. Yip

Sampling-based Motion Planners (SMPs) have become increasingly popular as they provide collision-free path solutions regardless of obstacle geometry in a given environment. However, their computational complexity increases significantly with the dimensionality of the motion planning problem. Adaptive sampling is one of the ways to speed up SMPs by sampling a particular region of a configuration space that is more likely to contain an optimal path solution. Although there are a wide variety of algorithms for adaptive sampling, they rely on hand-crafted heuristics; furthermore, their performance decreases significantly in high-dimensional spaces. In this paper, we present a neural network-based adaptive sampler for motion planning called Deep Sampling-based Motion Planner (DeepSMP). DeepSMP generates samples for SMPs and enhances their overall speed significantly while exhibiting efficient scalability to higher-dimensional problems. DeepSMP's neural architecture comprises of a Contractive AutoEncoder which encodes given workspaces directly from a raw point cloud data, and a Dropout-based stochastic deep feedforward neural network which takes the workspace encoding, start and goal configuration, and iteratively generates feasible samples for SMPs to compute end-to-end collision-free optimal paths. DeepSMP is not only consistently computationally efficient in all tested environments but has also shown remarkable generalization to completely unseen environments. We evaluate DeepSMP on multiple planning problems including planning of a point-mass robot, rigid-body, 6-link robotic manipulator in various 2D and 3D environments. The results show that on average our method is at least 7 times faster in point-mass and rigid-body case and about 28 times faster in 6-link robot case than the existing state-of-the-art.

LGSep 17, 2018
Adversarial Imitation via Variational Inverse Reinforcement Learning

Ahmed H. Qureshi, Byron Boots, Michael C. Yip

We consider a problem of learning the reward and policy from expert examples under unknown dynamics. Our proposed method builds on the framework of generative adversarial networks and introduces the empowerment-regularized maximum-entropy inverse reinforcement learning to learn near-optimal rewards and policies. Empowerment-based regularization prevents the policy from overfitting to expert demonstrations, which advantageously leads to more generalized behaviors that result in learning near-optimal rewards. Our method simultaneously learns empowerment through variational information maximization along with the reward and policy under the adversarial learning formulation. We evaluate our approach on various high-dimensional complex control tasks. We also test our learned rewards in challenging transfer learning problems where training and testing environments are made to be different from each other in terms of dynamics or structure. The results show that our proposed method not only learns near-optimal rewards and policies that are matching expert behavior but also performs significantly better than state-of-the-art inverse reinforcement learning algorithms.

ROJul 22, 2018
Potentially Guided Bidirectionalized RRT* for Fast Optimal Path Planning in Cluttered Environments

Zaid Tahir, Ahmed H. Qureshi, Yasar Ayaz et al.

Rapidly-exploring Random Tree star (RRT*) has recently gained immense popularity in the motion planning community as it provides a probabilistically complete and asymptotically optimal solution without requiring the complete information of the obstacle space. In spite of all of its advantages, RRT* converges to an optimal solution very slowly. Hence to improve the convergence rate, its bidirectional variants were introduced, the Bi-directional RRT* (B-RRT*) and Intelligent Bi-directional RRT* (IB-RRT*). However, as both variants perform pure exploration, they tend to suffer in highly cluttered environments. In order to overcome these limitations, we introduce a new concept of potentially guided bidirectional trees in our proposed Potentially Guided Intelligent Bi-directional RRT* (PIB-RRT*) and Potentially Guided Bi-directional RRT* (PB-RRT*). The proposed algorithms greatly improve the convergence rate and have a more efficient memory utilization. Theoretical and experimental evaluation of the proposed algorithms have been made and compared to the latest state of the art motion planning algorithms under different challenging environmental conditions and have proven their remarkable improvement in efficiency and convergence rate.

ROJun 14, 2018
Motion Planning Networks

Ahmed H. Qureshi, Anthony Simeonov, Mayur J. Bency et al.

Fast and efficient motion planning algorithms are crucial for many state-of-the-art robotics applications such as self-driving cars. Existing motion planning methods become ineffective as their computational complexity increases exponentially with the dimensionality of the motion planning problem. To address this issue, we present Motion Planning Networks (MPNet), a neural network-based novel planning algorithm. The proposed method encodes the given workspaces directly from a point cloud measurement and generates the end-to-end collision-free paths for the given start and goal configurations. We evaluate MPNet on various 2D and 3D environments including the planning of a 7 DOF Baxter robot manipulator. The results show that MPNet is not only consistently computationally efficient in all environments but also generalizes to completely unseen environments. The results also show that the computation time of MPNet consistently remains less than 1 second in all presented experiments, which is significantly lower than existing state-of-the-art motion planning algorithms.