CV · May 7, 2024

Tactile-Augmented Radiance Fields

arXiv:2405.04534v1 · 40 citations · h-index: 25 · CVPR
AI Analysis

This work addresses the challenge of multimodal perception for robotics or human-computer interaction by combining vision and touch, though it appears incremental as it builds on existing neural radiance fields and diffusion models.

The paper tackles the problem of integrating vision and touch in a shared 3D scene representation by introducing a tactile-augmented radiance field (TaRF), which estimates visual and tactile signals from photos and sparse touch probes, and demonstrates accuracy in cross-modal generation and utility in downstream tasks.

We present a scene representation, which we call a tactile-augmented radiance field (TaRF), that brings vision and touch into a shared 3D space. This representation can be used to estimate the visual and tactile signals for a given 3D position within a scene. We capture a scene's TaRF from a collection of photos and sparsely sampled touch probes. Our approach makes use of two insights: (i) common vision-based touch sensors are built on ordinary cameras and thus can be registered to images using methods from multi-view geometry, and (ii) visually and structurally similar regions of a scene share the same tactile features. We use these insights to register touch signals to a captured visual scene, and to train a conditional diffusion model that, provided with an RGB-D image rendered from a neural radiance field, generates its corresponding tactile signal. To evaluate our approach, we collect a dataset of TaRFs. This dataset contains more touch samples than previous real-world datasets, and it provides spatially aligned visual signals for each captured touch signal. We demonstrate the accuracy of our cross-modal generative model and the utility of the captured visual-tactile data on several downstream tasks. Project page: https://dou-yiming.github.io/TaRF
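The abstract says touch signals are registered to the visual scene using multi-view geometry but does not spell out the steps. As a rough illustration only, the sketch below assumes a camera pose already expressed in the scene frame (for example, from structure from motion) and a fixed, pre-calibrated camera-to-sensor offset, then composes the two rigid transforms to place a touch contact in scene coordinates. All names and numbers here (`world_from_camera`, `camera_from_sensor`, the poses themselves) are hypothetical and not taken from the paper.

```python
# Hypothetical sketch: names and values are illustrative, not from the paper.

def matmul4(a, b):
    """Multiply two 4x4 homogeneous transforms given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(t, p):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(t[i][k] * v[k] for k in range(4)) for i in range(3))


# Assumed camera pose in the scene (world) frame: rotated 90 degrees
# about the z axis and translated to (1, 2, 0).
world_from_camera = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0,  0.0, 0.0, 2.0],
    [0.0,  0.0, 1.0, 0.0],
    [0.0,  0.0, 0.0, 1.0],
]

# Assumed fixed calibration: the touch sensor sits 0.1 units along the
# camera's z axis, with no relative rotation.
camera_from_sensor = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.1],
    [0.0, 0.0, 0.0, 1.0],
]

# Compose the transforms to get the sensor pose in the scene frame.
world_from_sensor = matmul4(world_from_camera, camera_from_sensor)

# The sensor origin (the contact point in this toy setup) in scene coordinates.
contact_in_world = transform_point(world_from_sensor, (0.0, 0.0, 0.0))
print(contact_in_world)
```

Once a contact has a position in the scene frame, it can in principle be paired with an RGB-D view rendered at that location, which is the kind of spatially aligned visual-tactile pair the abstract describes.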

Code Implementations: 1 repo
