CV, LG, RO
Jun 14, 2019

Connecting Touch and Vision via Cross-Modal Prediction

arXiv:1906.06322v1
172 citations
Originality: Incremental advance
AI Analysis

This work addresses cross-modal perception between vision and touch. The advance is incremental: it builds on existing multi-modal research by introducing tasks and a dataset specific to this domain.

The paper tackles the problem of connecting vision and touch by synthesizing tactile signals from visual inputs and imagining visual interactions from tactile data, using a conditional adversarial model that incorporates scale and location information to address the scale discrepancy between the two modalities. Human perceptual studies show the model produces realistic cross-modal predictions, with qualitative and quantitative results validating its effectiveness.

Humans perceive the world using multi-modal sensory inputs such as vision, audition, and touch. In this work, we investigate the cross-modal connection between vision and touch. The main challenge in this cross-domain modeling task lies in the significant scale discrepancy between the two: while our eyes perceive an entire visual scene at once, humans can only feel a small region of an object at any given moment. To connect vision and touch, we introduce new tasks of synthesizing plausible tactile signals from visual inputs as well as imagining how we interact with objects given tactile data as input. To accomplish our goals, we first equip robots with both visual and tactile sensors and collect a large-scale dataset of corresponding vision and tactile image sequences. To close the scale gap, we present a new conditional adversarial model that incorporates the scale and location information of the touch. Human perceptual studies demonstrate that our model can produce realistic visual images from tactile data and vice versa. Finally, we present both qualitative and quantitative experimental results regarding different system designs, as well as visualizing the learned representations of our model.

Code Implementations: 1 repo
