RO CV LG | Jul 24, 2022

Pose Forecasting in Industrial Human-Robot Collaboration

Alessio Sampieri, Guido D'Amely, Andrea Avogaro, Federico Cunico, Geri Skenderi, Francesco Setti, Marco Cristani, Fabio Galasso

arXiv:2208.07308v1 | 19.446 citations | h-index: 43 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses safety and efficiency in industrial settings by enabling collaborative robots to forecast human poses and detect collisions. It is an incremental advance: a novel method applied to a known bottleneck.

The paper tackles human pose forecasting for industrial human-robot collaboration by proposing a Separable-Sparse Graph Convolutional Network (SeS-GCN), which reduces parameters by 98.28% and speeds up inference by ~4 times while maintaining comparable accuracy on Human3.6M. It also introduces a new benchmark dataset (CHICO) and demonstrates SeS-GCN's performance with an average error of 85.3 mm for pose forecasting and an F1-score of 0.64 for collision detection.

Pushing back the frontiers of collaborative robots in industrial environments, we propose a new Separable-Sparse Graph Convolutional Network (SeS-GCN) for pose forecasting. For the first time, SeS-GCN bottlenecks the interaction of the spatial, temporal and channel-wise dimensions in GCNs, and it learns sparse adjacency matrices by a teacher-student framework. Compared to the state-of-the-art, it only uses 1.72% of the parameters and it is ~4 times faster, while still performing comparably in forecasting accuracy on Human3.6M at 1 second in the future, which enables cobots to be aware of human operators. As a second contribution, we present a new benchmark of Cobots and Humans in Industrial COllaboration (CHICO). CHICO includes multi-view videos, 3D poses and trajectories of 20 human operators and cobots, engaging in 7 realistic industrial actions. Additionally, it reports 226 genuine collisions, taking place during the human-cobot interaction. We test SeS-GCN on CHICO for two important perception tasks in robotics: human pose forecasting, where it reaches an average error of 85.3 mm (MPJPE) at 1 sec in the future with a run time of 2.3 msec, and collision detection, by comparing the forecasted human motion with the known cobot motion, obtaining an F1-score of 0.64.
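The two evaluation quantities named in the abstract, MPJPE and F1-score, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy inputs, and in particular the distance-threshold rule in `predicts_collision` (with its `threshold_mm` default) are assumptions made for illustration, since the abstract only says that forecasted human motion is compared with the known cobot motion.

```python
import math

def mpjpe(pred, target):
    """Mean per-joint position error: the average Euclidean distance
    between predicted and ground-truth joints (same units as the input)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)

def min_distance(human_joints, robot_points):
    """Smallest distance between any human joint and any robot point."""
    return min(math.dist(h, r) for h in human_joints for r in robot_points)

def predicts_collision(forecast_frames, robot_frames, threshold_mm=100.0):
    """Hypothetical rule (an assumption, not the paper's stated criterion):
    flag a collision if, in any future frame, a forecasted joint comes
    closer than threshold_mm to the planned robot position."""
    return any(
        min_distance(h, r) < threshold_mm
        for h, r in zip(forecast_frames, robot_frames)
    )

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy usage with made-up coordinates in millimetres.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
target = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mpjpe(pred, target)
```

In this sketch a lower `mpjpe` means a more accurate forecast, while `f1_score` summarizes how well the flagged collisions match the annotated ones.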
