CV · May 20, 2025

RA-Touch: Retrieval-Augmented Touch Understanding with Enriched Visual Data

arXiv:2505.14270v1 · 2 citations · h-index: 2 · Has Code · MM
Originality: Incremental advance
AI Analysis

This work addresses the high cost of collecting tactile data for understanding object properties such as texture and softness. Its approach is an incremental advance in visuo-tactile perception.

The paper introduces RA-Touch, a retrieval-augmented framework for visuo-tactile perception that leverages visual data enriched with tactile semantics. It outperforms prior methods on the TVL benchmark.

Visuo-tactile perception aims to understand an object's tactile properties, such as texture, softness, and rigidity. However, the field remains underexplored because collecting tactile data is costly and labor-intensive. We observe that visually distinct objects can exhibit similar surface textures or material properties. For example, a leather sofa and a leather jacket have different appearances but share similar tactile properties. This implies that tactile understanding can be guided by material cues in visual data, even without direct tactile supervision. In this paper, we introduce RA-Touch, a retrieval-augmented framework that improves visuo-tactile perception by leveraging visual data enriched with tactile semantics. We carefully recaption a large-scale visual dataset with tactile-focused descriptions, enabling the model to access tactile semantics typically absent from conventional visual datasets. A key challenge remains in effectively utilizing these tactile-aware external descriptions. RA-Touch addresses this by retrieving visual-textual representations aligned with tactile inputs and integrating them to focus on relevant textural and material properties. By outperforming prior methods on the TVL benchmark, our method demonstrates the potential of retrieval-based visual reuse for tactile understanding. Code is available at https://aim-skku.github.io/RA-Touch
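The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes tactile inputs and tactile-focused captions have already been embedded into a shared vector space, and it uses plain cosine similarity with top-k selection as a stand-in for whatever retriever RA-Touch actually uses. The names `cosine`, `retrieve`, `bank`, and `tactile_query`, as well as the toy embeddings and captions, are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, bank, k=2):
    """Return the k captions whose embeddings are most similar to the query."""
    ranked = sorted(bank, key=lambda item: cosine(query, item[0]), reverse=True)
    return [caption for _, caption in ranked[:k]]


# Hypothetical toy data: (embedding, tactile-focused caption) pairs.
bank = [
    ([0.9, 0.1, 0.0], "smooth, supple leather"),
    ([0.8, 0.2, 0.1], "soft leather with fine grain"),
    ([0.0, 0.1, 0.9], "cold, rigid metal"),
]

# Hypothetical embedding of a tactile reading taken from a leather surface.
tactile_query = [1.0, 0.0, 0.0]

top = retrieve(tactile_query, bank, k=2)
print(top)
```

In this toy setup, the two leather captions rank above the metal one, mirroring the abstract's observation that a leather sofa and a leather jacket look different but feel alike. How the retrieved descriptions are then integrated with the tactile input is specific to the paper and is not sketched here.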

