AI-driven visual monitoring of industrial assembly tasks
This work addresses the need for flexible, reliable monitoring that prevents equipment damage and ensures worker safety in industrial settings. It is an incremental improvement over existing commercial solutions.
The paper tackles visual monitoring of industrial assembly tasks by introducing ViMAT, an AI-driven system that requires neither rigid workspace setups nor visual markers and that monitors tasks in real time in challenging scenarios with partial and uncertain observations.
Visual monitoring of industrial assembly tasks is critical for preventing equipment damage due to procedural errors and ensuring worker safety. Although commercial solutions exist, they typically require rigid workspace setups or the application of visual markers to simplify the problem. We introduce ViMAT, a novel AI-driven system for real-time visual monitoring of assembly tasks that operates without these constraints. ViMAT combines a perception module that extracts visual observations from multi-view video streams with a reasoning module that infers the most likely action being performed based on the observed assembly state and prior task knowledge. We validate ViMAT on two assembly tasks, involving the replacement of LEGO components and the reconfiguration of hydraulic press molds, demonstrating its effectiveness through quantitative and qualitative analysis in challenging real-world scenarios characterized by partial and uncertain visual observations. Project page: https://tev-fbk.github.io/ViMAT
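To make the perception-plus-reasoning split concrete, the following is a minimal illustrative sketch, not ViMAT's actual implementation. It assumes that perception yields, per camera view, a probability that each component is present (or `None` when the component is occluded), and that prior task knowledge is a table of the assembly state expected after each action. The action names, component names, and the functions `fuse_views`, `action_log_likelihood`, and `most_likely_action` are all hypothetical.

```python
from math import log

# Hypothetical prior task knowledge (not from the paper): the assembly state
# expected after each action, as component name -> whether it is present.
TASK_ACTIONS = {
    "remove_brick_a": {"brick_a": False, "brick_b": True},
    "remove_brick_b": {"brick_a": True, "brick_b": False},
    "idle": {"brick_a": True, "brick_b": True},
}


def fuse_views(view_probs):
    """Average presence probabilities over the views that saw the component.

    Returns None when every view is occluded (partial observation).
    """
    seen = [p for p in view_probs if p is not None]
    return sum(seen) / len(seen) if seen else None


def action_log_likelihood(expected_state, observations, eps=1e-6):
    """Score one action by summing log-likelihoods of the observed components.

    Components with no observation (None or missing) contribute no evidence.
    """
    total = 0.0
    for component, present in expected_state.items():
        p_present = observations.get(component)
        if p_present is None:
            continue
        p = p_present if present else 1.0 - p_present
        total += log(max(p, eps))  # clamp to avoid log(0)
    return total


def most_likely_action(observations, actions=TASK_ACTIONS):
    """Return the highest-scoring action name and the score of every action."""
    scores = {
        name: action_log_likelihood(state, observations)
        for name, state in actions.items()
    }
    return max(scores, key=scores.get), scores


# Example: three views of brick_a (one occluded), brick_b never visible.
observations = {
    "brick_a": fuse_views([0.1, None, 0.1]),
    "brick_b": fuse_views([None, None, None]),
}
best_action, all_scores = most_likely_action(observations)
```

In this toy setup, occluded components are simply skipped rather than penalized, which is one simple way to tolerate partial observations. A real system would likely also condition on the previous assembly state and the allowed action order.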