Towards Generalization of Tactile Image Generation: Reference-Free Evaluation in a Leakage-Free Setting
This work addresses the challenge of ensuring robust generalization in tactile sensing for applications such as robotics and multimodal learning, though its contribution to evaluation methods is incremental.
The paper tackles the problem of performance metrics in tactile image generation being inflated by overlapping training and test samples. It proposes a leakage-free evaluation protocol and reference-free metrics, and reports superior performance and enhanced generalization on two datasets.
Tactile sensing, which relies on direct physical contact, is critical for human perception and underpins applications in computer vision, robotics, and multimodal learning. Because tactile data is often scarce and costly to acquire, generating synthetic tactile images provides a scalable solution to augment real-world measurements. However, ensuring robust generalization in synthesizing tactile images (capturing subtle, material-specific contact features) remains challenging. We demonstrate that overlapping training and test samples in commonly used datasets inflate performance metrics, obscuring the true generalizability of tactile models. To address this, we propose a leakage-free evaluation protocol coupled with novel, reference-free metrics (TMMD, I-TMMD, CI-TMMD, and D-TMMD) tailored for tactile generation. Moreover, we propose a vision-to-touch generation method that leverages text as an intermediate modality by incorporating concise, material-specific descriptions during training to better capture essential tactile features. Experiments on two popular visuo-tactile datasets, Touch and Go and HCT, show that our approach achieves superior performance and enhanced generalization in a leakage-free setting.
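To illustrate what a leakage-free split can look like in practice, the following is a minimal sketch, not the protocol used in this work. It assumes that each sample carries a group identifier (for example a recording or object ID) and that leakage arises when samples from the same group land in both the training and test sets. The function names `leakage_free_split` and `shared_groups`, the `clip` field, and the 20% test fraction are hypothetical choices made for illustration only.

```python
import random
from collections import defaultdict


def leakage_free_split(samples, group_key, test_fraction=0.2, seed=0):
    """Split samples so that no group appears in both train and test.

    `group_key` maps a sample to its group ID (assumed here to be a
    sortable value such as a recording or object name).
    """
    # Bucket samples by group so that whole groups move together.
    groups = defaultdict(list)
    for sample in samples:
        groups[group_key(sample)].append(sample)

    # Shuffle the group IDs deterministically, then hold out a fraction.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [s for g in group_ids if g not in test_ids for s in groups[g]]
    test = [s for g in group_ids if g in test_ids for s in groups[g]]
    return train, test


def shared_groups(train, test, group_key):
    """Return the group IDs present in both splits (empty set = no leakage)."""
    return {group_key(s) for s in train} & {group_key(s) for s in test}


# Hypothetical data: 10 clips with 5 frames each.
samples = [{"clip": f"clip{i // 5}", "frame": i} for i in range(50)]
train, test = leakage_free_split(samples, group_key=lambda s: s["clip"])
overlap = shared_groups(train, test, group_key=lambda s: s["clip"])
```

By contrast, a purely random per-frame split of the same data would typically place frames from one clip on both sides, which is the kind of overlap that can inflate reported metrics. Checking `shared_groups` on an existing split is a cheap way to detect that situation before evaluating.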