Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces From Images
This work addresses a critical need in object manipulation tasks such as grasping and pushing, but its contribution is incremental because it builds on existing cross-modal learning approaches.
The paper tackles the problem of estimating tactile physical properties from visual information. It builds a model that learns this mapping from a new dataset of over 400 multiview image sequences, and it estimates fifteen properties across categories such as friction and compliance.
The connection between visual input and tactile sensing is critical for object manipulation tasks such as grasping and pushing. In this work, we introduce the challenging task of estimating a set of tactile physical properties from visual information. We aim to build a model that learns the complex mapping between visual information and tactile physical properties. We construct a first-of-its-kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties. A total of fifteen tactile physical properties across categories including friction, compliance, adhesion, texture, and thermal conductance are measured and then estimated by our models. We develop a cross-modal framework composed of an adversarial objective and a novel visuo-tactile joint classification loss. Additionally, we develop a neural architecture search framework capable of selecting optimal combinations of viewing angles for estimating a given physical property.
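To make the view-selection idea concrete, here is a minimal sketch of choosing a combination of viewing angles for one property. It is not the neural architecture search described in the abstract: it swaps in an exhaustive search over view subsets, scored by a ridge regressor on held-out data. The function names (`fit_ridge`, `select_views`), the per-view feature layout, and the regularization value are all assumptions made for illustration.

```python
import itertools

import numpy as np


def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge regression weights (hypothetical stand-in estimator)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def select_views(train_feats, train_y, val_feats, val_y, k):
    """Return the k-view combination with the lowest validation MSE.

    train_feats / val_feats have shape (n_samples, n_views, feat_dim):
    one assumed feature vector per viewing angle. train_y / val_y hold
    a single tactile property value per sample.
    """
    n_views = train_feats.shape[1]
    best_combo, best_mse = None, np.inf
    # Exhaustive search over view subsets (an assumption, not the paper's NAS).
    for combo in itertools.combinations(range(n_views), k):
        idx = list(combo)
        # Concatenate the features of the chosen views into one vector.
        X_train = train_feats[:, idx, :].reshape(len(train_feats), -1)
        X_val = val_feats[:, idx, :].reshape(len(val_feats), -1)
        w = fit_ridge(X_train, train_y)
        mse = float(np.mean((X_val @ w - val_y) ** 2))
        if mse < best_mse:
            best_combo, best_mse = combo, mse
    return best_combo, best_mse
```

Exhaustive search is only practical for a small number of viewing angles, since the number of subsets grows combinatorially. A learned search strategy such as the one the abstract mentions would presumably avoid enumerating every subset.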