RO · CV · Mar 14, 2024

Touch-GS: Visual-Tactile Supervised 3D Gaussian Splatting

arXiv:2403.09875v3 · 24 citations · IROS
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurate 3D representation for robotic manipulation, especially with difficult materials, though it is an incremental advance that combines existing sensors and methods.

The paper tackles the problem of improving 3D scene reconstruction from few views by supervising 3D Gaussian Splatting with optical tactile sensor data, resulting in quantitatively and qualitatively better performance on opaque, reflective, and transparent objects compared to using vision or touch alone.

In this work, we propose a novel method to supervise 3D Gaussian Splatting (3DGS) scenes using optical tactile sensors. Optical tactile sensors are now widely used in robotics for manipulation and object representation; however, raw optical tactile sensor data is unsuitable for directly supervising a 3DGS scene. Our representation leverages a Gaussian Process Implicit Surface to implicitly represent the object, combining many touches into a unified representation with uncertainty. We merge this model with a monocular depth estimation network, which is aligned in a two-stage process: coarse alignment with a depth camera, followed by fine adjustment to match our touch data. For every training image, our method produces a corresponding fused depth and uncertainty map. Utilizing this additional information, we propose a new loss function, a variance-weighted depth-supervised loss, for training the 3DGS scene model. We leverage the DenseTact optical tactile sensor and RealSense RGB-D camera to show that combining touch and vision in this manner leads to quantitatively and qualitatively better results than vision or touch alone in few-view scene synthesis on opaque as well as reflective and transparent objects. Please see our project page at http://armlabstanford.github.io/touch-gs
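
The abstract does not give the formulas for the fused depth and uncertainty map or for the variance-weighted loss, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each pixel has a vision-based and a touch-based depth estimate, each with a variance, that the two are fused by standard inverse-variance weighting, and that the loss is a squared depth error divided by the fused variance. The function names `fuse_depth` and `variance_weighted_depth_loss` and the `eps` parameter are hypothetical and do not come from the paper.

```python
from typing import Sequence, Tuple


def fuse_depth(d_vision: float, var_vision: float,
               d_touch: float, var_touch: float) -> Tuple[float, float]:
    """Fuse two depth estimates for one pixel by inverse-variance weighting.

    Assumption: both estimates are independent Gaussians. This is a generic
    fusion rule, not necessarily the one used in the paper.
    """
    w_v = 1.0 / var_vision
    w_t = 1.0 / var_touch
    fused_var = 1.0 / (w_v + w_t)
    fused_depth = fused_var * (w_v * d_vision + w_t * d_touch)
    return fused_depth, fused_var


def variance_weighted_depth_loss(rendered: Sequence[float],
                                 target: Sequence[float],
                                 variance: Sequence[float],
                                 eps: float = 1e-6) -> float:
    """Mean squared depth error, each pixel weighted by 1 / (variance + eps).

    Pixels with low uncertainty (for example, near a touch) dominate the
    loss; pixels with high uncertainty contribute little.
    """
    if not (len(rendered) == len(target) == len(variance)):
        raise ValueError("all inputs must have the same length")
    if len(rendered) == 0:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for r, t, v in zip(rendered, target, variance):
        total += (r - t) ** 2 / (v + eps)
    return total / len(rendered)


if __name__ == "__main__":
    # A confident touch reading (small variance) pulls the fused depth toward it.
    depth, var = fuse_depth(d_vision=2.0, var_vision=1.0,
                            d_touch=1.0, var_touch=0.01)
    print(depth, var)
    # The same 1 m error costs far less where the target is uncertain.
    print(variance_weighted_depth_loss([2.0], [1.0], [1.0]))
    print(variance_weighted_depth_loss([2.0], [1.0], [100.0]))
```

Under these assumptions, a pixel covered by a confident touch measurement constrains the rendered depth strongly, while a pixel supported only by a noisy monocular estimate is penalized weakly, which matches the abstract's stated motivation for carrying an uncertainty map alongside the fused depth.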

