CVMar 14, 2024

Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines

arXiv:2403.09056v13 citations
Originality Synthesis-oriented
AI Analysis

This work addresses real-time hand action recognition for industrial assembly line supervision, but it appears incremental by applying existing foundation models and skeletal tracking to a specific domain.

The paper tackled the problem of insufficient and low-quality training datasets and real-time performance bottlenecks for hand action recognition in industrial assembly lines, achieving 98.8% accuracy in wire insertion scenarios.

On modern industrial assembly lines, many intelligent algorithms have been developed to replace or supervise workers. However, we found that there were bottlenecks in both training datasets and real-time performance when deploying algorithms on actual assembly line. Therefore, we developed a promising strategy for expanding industrial datasets, which utilized large models with strong generalization abilities to achieve efficient, high-quality, and large-scale dataset expansion, solving the problem of insufficient and low-quality industrial datasets. We also applied this strategy to video action recognition. We proposed a method of converting hand action recognition problems into hand skeletal trajectory classification problems, which solved the real-time performance problem of industrial algorithms. In the "hand movements during wire insertion" scenarios on the actual assembly line, the accuracy of hand action recognition reached 98.8\%. We conducted detailed experimental analysis to demonstrate the effectiveness and superiority of the method, and deployed the entire process on Midea's actual assembly line.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes