CV, Apr 2, 2024

PREGO: online mistake detection in PRocedural EGOcentric videos

arXiv:2404.01933v2, 37 citations, h-index: 37, CVPR
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of promptly identifying mistakes in procedural tasks, with applications such as manufacturing and healthcare. Its advance is incremental, since it builds on existing action recognition and symbolic reasoning methods.

The paper tackles online detection of procedural errors in egocentric videos. It proposes PREGO as the first model for this task. PREGO detects mistakes by comparing the recognized current action with the expected future one, and the paper establishes benchmarks on adapted datasets.

Promptly identifying procedural errors from egocentric videos in an online setting is highly challenging and valuable for detecting mistakes as soon as they happen. This capability has a wide range of applications across various fields, such as manufacturing and healthcare. The nature of procedural mistakes is open-set since novel types of failures might occur, which calls for one-class classifiers trained on correctly executed procedures. However, no technique can currently detect open-set procedural mistakes online. We propose PREGO, the first online one-class classification model for mistake detection in PRocedural EGOcentric videos. PREGO is based on an online action recognition component to model the current action, and a symbolic reasoning module to predict the next actions. Mistake detection is performed by comparing the recognized current action with the expected future one. We evaluate PREGO on two procedural egocentric video datasets, Assembly101 and Epic-tent, which we adapt for online benchmarking of procedural mistake detection to establish suitable benchmarks, thus defining the Assembly101-O and Epic-tent-O datasets, respectively.
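The comparison step described in the abstract can be illustrated with a toy sketch. This is not PREGO's implementation: the action recognizer is replaced by a ready-made list of action labels, and the symbolic reasoning module is replaced by a simple lookup of which actions followed which in correctly executed procedures. All function names and action labels below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_expected_next(correct_procedures):
    """Map each action to the set of actions that followed it in correct runs.

    Assumption: a stand-in for the paper's symbolic reasoning module,
    learned only from correctly executed procedures (one-class setting).
    """
    expected = defaultdict(set)
    for steps in correct_procedures:
        for prev, nxt in zip(steps, steps[1:]):
            expected[prev].add(nxt)
    return expected


def detect_mistakes_online(recognized_stream, expected_next):
    """Yield (index, action, is_mistake) as each recognized action arrives."""
    prev = None
    for i, action in enumerate(recognized_stream):
        # The first action has no preceding context, so it is never flagged.
        is_mistake = prev is not None and action not in expected_next.get(prev, set())
        yield i, action, is_mistake
        prev = action


# Hypothetical correct procedures used as the only training signal.
correct = [
    ["pick_wheel", "attach_wheel", "pick_cabin", "attach_cabin"],
    ["pick_cabin", "attach_cabin", "pick_wheel", "attach_wheel"],
]
expected_next = build_expected_next(correct)

# Hypothetical recognizer output: the cabin is attached without being picked up.
stream = ["pick_wheel", "attach_wheel", "attach_cabin"]
results = list(detect_mistakes_online(stream, expected_next))
for index, action, is_mistake in results:
    print(index, action, "MISTAKE" if is_mistake else "ok")
```

Because the expected transitions come only from correct runs, any unseen transition is flagged, which mirrors the open-set, one-class framing in the abstract. A real system would also have to cope with recognition errors, which this sketch ignores.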

Code Implementations: 1 repo