Pavan Mantripragada

h-index: 2

Papers: 5

Citations: 25

Novelty: 55%

AI Score: 47

Ranked #31,706 of 194,257 authors (top 16%); #819 in RO (top 12%)

5 Papers

12.5 · RO · Oct 12, 2023

AcTExplore: Active Tactile Exploration of Unknown Objects

Amir-Hossein Shahidzadeh, Seong Jong Yoo, Pavan Mantripragada et al.

Tactile exploration plays a crucial role in understanding object structures for fundamental robotics tasks such as grasping and manipulation. However, efficiently exploring such objects using tactile sensors is challenging, primarily because the environments are large and unknown and the sensors have limited sensing coverage. To this end, we present AcTExplore, an active tactile exploration method driven by reinforcement learning for object reconstruction at scale, which automatically explores object surfaces in a limited number of steps. Through sufficient exploration, our algorithm incrementally collects tactile data and reconstructs the 3D shapes of the objects, which can serve as a representation for higher-level downstream tasks. Our method achieves an average of 95.97% IoU coverage on unseen YCB objects despite being trained only on primitive shapes. Project Webpage: https://prg.cs.umd.edu/AcTExplore
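For readers unfamiliar with the metric, the following is a minimal sketch of intersection over union (IoU) between a reconstructed shape and a reference shape, both represented as sets of occupied voxel indices. The voxel-set representation, the function name `voxel_iou`, and the toy data are assumptions made for illustration; the abstract does not specify how the authors compute their IoU coverage figure.

```python
def voxel_iou(predicted, ground_truth):
    """Intersection over union of two collections of occupied voxel indices."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    union = predicted | ground_truth
    if not union:
        # Convention chosen for this sketch: two empty shapes count as identical.
        return 1.0
    return len(predicted & ground_truth) / len(union)


# Toy example (hypothetical data): a 2x2x1 slab and a reconstruction
# that has not yet touched one of its four voxels.
truth = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)}
recon = {(0, 0, 0), (1, 0, 0), (0, 1, 0)}
iou = voxel_iou(recon, truth)  # 3 shared voxels out of 4 in the union
```

Under this definition, a score near 1.0 means the explored surface and the reference shape occupy almost the same voxels.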

2.2 · RO · Feb 13

Monocular Reconstruction of Neural Tactile Fields

Pavan Mantripragada, Siddhanth Deshmukh, Eadom Dessalene et al.

Robots operating in the real world must plan through environments that deform, yield, and reconfigure under contact, requiring interaction-aware 3D representations that extend beyond static geometric occupancy. To address this, we introduce neural tactile fields, a novel 3D representation that maps spatial locations to the expected tactile response upon contact. Our model predicts these neural tactile fields from a single monocular RGB image, and is the first method to do so. When integrated with off-the-shelf path planners, neural tactile fields enable robots to generate paths that avoid high-resistance objects while deliberately routing through low-resistance regions (e.g., foliage), rather than treating all occupied space as equally impassable. Empirically, our learning framework improves volumetric 3D reconstruction by 85.8% and surface reconstruction by 26.7% compared to state-of-the-art monocular 3D reconstruction methods (LRM and Direct3D).
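To illustrate the planning idea, here is a minimal sketch of a grid planner that charges a per-cell resistance cost instead of treating every occupied cell as blocked. It uses plain Dijkstra search on a 2D grid; the function `plan_path`, the cost model (one unit per step plus the cell's resistance), and the toy grid are all assumptions for illustration and are not taken from the paper.

```python
import heapq
import math


def plan_path(resistance, start, goal):
    """Dijkstra search on a 4-connected grid of per-cell resistance values.

    Entering a cell costs 1 + resistance; cells with infinite resistance
    are impassable. Returns a list of (row, col) cells or None.
    """
    rows, cols = len(resistance), len(resistance[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, math.inf):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if math.isinf(resistance[nr][nc]):
                continue  # rigid obstacle
            nd = d + 1.0 + resistance[nr][nc]
            if nd < dist.get((nr, nc), math.inf):
                dist[(nr, nc)] = nd
                prev[(nr, nc)] = cell
                heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


INF = math.inf
# Hypothetical scene: the middle column is occupied everywhere, but the
# centre cell is low-resistance (foliage-like) while the others are rigid.
grid = [
    [0.0, INF, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, INF, 0.0],
]
path = plan_path(grid, (1, 0), (1, 2))
```

In this toy scene a planner that treats all occupied cells as obstacles would find no route, whereas the resistance-aware planner passes through the low-resistance cell.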

7.7 · CV · Mar 16

FEEL (Force-Enhanced Egocentric Learning): A Dataset for Physical Action Understanding

Eadom Dessalene, Botao He, Michael Maynord et al.

We introduce FEEL (Force-Enhanced Egocentric Learning), the first large-scale dataset pairing force measurements gathered from custom piezoresistive gloves with egocentric video. Our gloves enable scalable data collection, and FEEL contains approximately 3 million force-synchronized frames of natural unscripted manipulation in kitchen environments, with 45% of frames involving hand-object contact. Because force is the underlying cause that drives physical interaction, it is a critical primitive for physical action understanding. We demonstrate the utility of force for physical action understanding by applying FEEL to two families of tasks: (1) contact understanding, where we jointly perform temporal contact segmentation and pixel-level contacted object segmentation; and (2) action representation learning, where force prediction serves as a self-supervised pretraining objective for video backbones. We achieve state-of-the-art temporal contact segmentation results and competitive pixel-level segmentation results without any need for manual contacted object segmentation annotations. Furthermore, we demonstrate that action representation learning with FEEL improves transfer performance on action understanding tasks, without any manual labels, across EPIC-Kitchens, SomethingSomething-V2, EgoExo4D and Meccano.

3.2 · RO · Oct 4, 2025

EmbodiSwap for Zero-Shot Robot Imitation Learning

Eadom Dessalene, Pavan Mantripragada, Michael Maynord et al.

We introduce EmbodiSwap, a method for producing photorealistic synthetic robot overlays over human video. We employ EmbodiSwap for zero-shot imitation learning, bridging the embodiment gap between in-the-wild egocentric human video and a target robot embodiment. We train a closed-loop robot manipulation policy over the data produced by EmbodiSwap. We make novel use of V-JEPA as a visual backbone, repurposing V-JEPA from the domain of video understanding to imitation learning over synthetic robot videos. Adoption of V-JEPA outperforms alternative vision backbones more conventionally used within robotics. In real-world tests, our zero-shot trained V-JEPA model achieves an 82% success rate, outperforming a few-shot trained π₀ network as well as π₀ trained over data produced by EmbodiSwap. We release (i) code for generating the synthetic robot overlays, which takes human videos and an arbitrary robot URDF as input and generates a robot dataset, (ii) the robot dataset we synthesize over EPIC-Kitchens, HOI4D and Ego4D, and (iii) model checkpoints and inference code, to facilitate reproducible research and broader adoption.

2.2 · RO · Nov 2, 2020 · Code

Grasping in the Dark: Zero-Shot Object Grasping Using Tactile Feedback

Kanishka Ganguly, Behzad Sadrfaridpour, Pavan Mantripragada et al.

Grasping and manipulating a wide variety of objects is a fundamental skill that will determine the success and widespread adoption of robots in homes. Several end-effector designs for robust manipulation have been proposed, but they mostly work only when provided with prior information about the objects or when equipped with external sensors for estimating object shape or size. Such approaches are limited to many-shot or known objects and are prone to estimation errors from external estimation systems. We propose an approach to grasp and manipulate previously unseen, or zero-shot, objects (objects for which no prior on shape, size, material or weight is available) using only feedback from tactile sensors, in contrast to the state of the art. Such an approach provides robust manipulation of objects either when the object model is not known or when it is estimated incorrectly by an external system. Our approach is inspired by how animals and humans manipulate objects, i.e., by using feedback from their skin. Our grasping and manipulation revolves around the simple notion that objects slip if not grasped stably. This slippage can be detected and counteracted for a robust grasp that is agnostic to the type, shape, size, material and weight of the object. At the crux of our approach is a novel tactile-feedback-based controller that detects and compensates for slip during grasp. We successfully evaluate and demonstrate our proposed approach in many real-world experiments using the Shadow Dexterous Hand equipped with BioTac SP tactile sensors, for different object shapes, sizes, weights and materials. We obtain an overall success rate of 73.5%.
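The detect-and-compensate idea can be sketched as a simple feedback loop. The version below is a toy illustration only: it flags slip whenever consecutive tactile readings change by more than a threshold and responds by raising the commanded grip force. The function `grip_controller`, its parameters, the scalar tactile signal, and the threshold rule are all assumptions; the abstract does not describe the authors' actual controller or the BioTac SP signal processing.

```python
def grip_controller(tactile_readings, initial_force=1.0, slip_threshold=0.2,
                    force_step=0.5, max_force=10.0):
    """Return the grip force commanded after each tactile reading.

    Slip is assumed (for this sketch) to show up as a jump between
    consecutive readings larger than slip_threshold.
    """
    force = initial_force
    commands = []
    previous = None
    for reading in tactile_readings:
        slipping = previous is not None and abs(reading - previous) > slip_threshold
        if slipping:
            # Compensate by tightening the grasp, without exceeding the limit.
            force = min(force + force_step, max_force)
        commands.append(force)
        previous = reading
    return commands


# Hypothetical signal: stable, then two slip events, then stable again.
readings = [1.0, 1.0, 1.5, 1.9, 1.9]
commands = grip_controller(readings)
```

Because the loop reacts only to changes in the tactile signal, it needs no model of the object's shape, size, material or weight, which is the property the abstract emphasizes.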