Sequential Image-based Attention Network for Inferring Force Estimation without Haptic Sensor
This work addresses the challenge of force estimation without haptic sensors, which could benefit robotics and human-computer interaction. The contribution appears incremental, however, because it builds on existing deep learning methods and extends them with new attention mechanisms.
The paper tackles the problem of estimating interaction forces between objects from visual input alone. It proposes a recurrent convolutional neural network with a sequential image-based attention module, and reports successful force inference across different target materials, illumination changes, and external force directions.
Humans can infer the approximate interaction force between objects from visual information alone, because we have learned to do so through experience. Based on this idea, we propose a recurrent convolutional neural network-based method that uses sequential images to infer interaction force without a haptic sensor. To train and validate deep learning methods, we collected a large number of images and the corresponding interaction forces using an electronic motor-based device. To concentrate on the changing shape of a target object under external force in the images, we propose a sequential image-based attention module, which learns a salient model from temporal dynamics. The proposed module consists of a sequential spatial attention module and a sequential channel attention module, both extended to exploit multiple sequential images. To improve accuracy, we also created a weighted average pooling layer for both the spatial and channel attention modules. Extensive experimental results verified that the proposed method successfully infers interaction forces under various conditions, such as different target materials, illumination changes, and external force directions.
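The abstract gives no implementation details for the attention module, so the following is only a rough NumPy sketch of how a sequential channel attention, a sequential spatial attention, and a weighted average pooling could fit together. Everything specific here is an assumption made for illustration: the function names, the `(T, C, H, W)` feature layout, the softmax-normalized pooling weights, the averaging over frames, and the small two-layer bottleneck are hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def weighted_spatial_pool(feats, spatial_logits):
    """Pool each (H, W) map to a scalar with learnable weights (assumed form).

    feats: (T, C, H, W); spatial_logits: (H, W). Zero logits give a plain mean.
    """
    h, w = spatial_logits.shape
    weights = softmax(spatial_logits.ravel()).reshape(h, w)  # sums to 1
    return (feats * weights).sum(axis=(2, 3))  # (T, C)


def weighted_channel_pool(feats, channel_logits):
    """Pool across channels with learnable weights (assumed form).

    feats: (T, C, H, W); channel_logits: (C,). Zero logits give a plain mean.
    """
    weights = softmax(channel_logits)  # (C,)
    return np.einsum("c,tchw->thw", weights, feats)  # (T, H, W)


def sequential_channel_attention(feats, spatial_logits, w1, w2):
    """One weight per channel, computed from all T frames together."""
    pooled = weighted_spatial_pool(feats, spatial_logits)  # (T, C)
    descriptor = pooled.mean(axis=0)  # (C,), frames combined (assumption)
    hidden = np.maximum(descriptor @ w1, 0.0)  # ReLU bottleneck
    return sigmoid(hidden @ w2)  # (C,), values in (0, 1)


def sequential_spatial_attention(feats, channel_logits):
    """One weight per pixel, computed from all T frames together."""
    pooled = weighted_channel_pool(feats, channel_logits)  # (T, H, W)
    return sigmoid(pooled.mean(axis=0))  # (H, W), values in (0, 1)


def apply_sequential_attention(feats, spatial_logits, channel_logits, w1, w2):
    """Apply channel attention, then spatial attention, to every frame."""
    ca = sequential_channel_attention(feats, spatial_logits, w1, w2)
    feats = feats * ca[None, :, None, None]
    sa = sequential_spatial_attention(feats, channel_logits)
    return feats * sa[None, None, :, :]


# Usage with random stand-in features: 4 frames, 8 channels, 5x5 maps.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 5, 5))
spatial_logits = np.zeros((5, 5))  # would be learned in a real model
channel_logits = np.zeros(8)       # would be learned in a real model
w1 = rng.standard_normal((8, 2)) * 0.1
w2 = rng.standard_normal((2, 8)) * 0.1
out = apply_sequential_attention(feats, spatial_logits, channel_logits, w1, w2)
```

In this sketch the attended features keep the input shape, so they could be passed on to a recurrent layer that regresses the force value. A trained version would learn the pooling logits and bottleneck weights by backpropagation and would more likely be written in a deep learning framework than in plain NumPy.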