RO | CV | Nov 11, 2020

Continuous Perception for Classifying Shapes and Weights of Garments for Robotic Vision Applications

arXiv:2011.06089v2
AI Analysis

This work addresses a domain-specific problem in robotic vision for laundry automation, but it is incremental as it builds on existing neural network methods with moderate performance gains.

The paper tackles the problem of predicting garment shapes and weights from video sequences for robotic laundry tasks, achieving classification accuracies of 48% for shapes and 60% for weights using a modified AlexNet-LSTM architecture.

We present an approach to continuous perception for robotic laundry tasks. Our assumption is that the visual prediction of a garment's shapes and weights is possible via a neural network that learns the dynamic changes of garments from video sequences. Continuous perception is leveraged during training by inputting consecutive frames, from which the network learns how a garment deforms. To evaluate our hypothesis, we captured a dataset of 40K RGB and 40K depth video sequences while a garment is being manipulated. We also conducted ablation studies to understand whether the neural network learns the physical and dynamic properties of garments. Our findings suggest that a modified AlexNet-LSTM architecture has the best classification performance for the garment's shapes and weights. To provide further evidence that continuous perception facilitates the prediction of the garment's shapes and weights, we evaluated our network on unseen video sequences and computed the 'Moving Average' over a sequence of predictions. We found that our network has a classification accuracy of 48% and 60% for shapes and weights of garments, respectively.
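The 'Moving Average' over a sequence of predictions mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the network emits one class-probability vector per frame, and it assumes a simple unweighted average over a sliding window. The function name `moving_average_predictions`, the `window` parameter, and the sample probabilities are all hypothetical.

```python
from collections import deque
from typing import Iterable, List, Sequence


def moving_average_predictions(
    frame_probs: Iterable[Sequence[float]], window: int = 3
) -> List[int]:
    """Return the smoothed class index after each frame.

    frame_probs: per-frame class probabilities (assumed network output).
    window: number of most recent frames to average (an assumed value).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # keeps only the last `window` frames
    smoothed_labels = []
    for probs in frame_probs:
        recent.append(list(probs))
        num_classes = len(recent[0])
        # Average each class probability over the frames currently in the window.
        mean = [sum(p[c] for p in recent) / len(recent) for c in range(num_classes)]
        # Pick the class with the highest averaged probability.
        smoothed_labels.append(max(range(num_classes), key=mean.__getitem__))
    return smoothed_labels


# Made-up probabilities for 3 classes over 5 frames; the third frame is noisy.
example = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.6, 0.1],
    [0.6, 0.3, 0.1],
    [0.7, 0.2, 0.1],
]
print(moving_average_predictions(example, window=3))
```

With these made-up inputs, a per-frame argmax would switch to class 1 on the third frame, whereas the windowed average keeps predicting class 0 throughout. That is the kind of stabilisation a moving average over consecutive predictions is meant to provide.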

Code Implementations: 1 repo
