Workflow Augmentation of Video Data for Event Recognition with Time-Sensitive Neural Networks
This work addresses rare event detection in medical video analysis. It offers a domain-specific solution that is incremental in scope but applicable to a variety of use cases.
The paper tackles the lack of temporal augmentation for video data in supervised neural network training, particularly in medical contexts such as cataract surgery. It introduces a workflow augmentation method that increases the frequency of event alternation by 26% relative to the original videos and achieves 3% higher classification accuracy and 7.8% higher precision than a state-of-the-art approach.
Supervised training of neural networks requires large, diverse and well-annotated data sets. In the medical field, this is often difficult to achieve due to constraints in time, expert knowledge and prevalence of an event. Artificial data augmentation can help to prevent overfitting and improve the detection of rare events as well as overall performance. However, most augmentation techniques use purely spatial transformations, which are not sufficient for video data with temporal correlations. In this paper, we present a novel methodology for workflow augmentation and demonstrate its benefit for event recognition in cataract surgery. The proposed approach increases the frequency of event alternation by creating artificial videos. The original video is split into event segments and a workflow graph is extracted from the original annotations. Finally, the segments are assembled into new videos based on the workflow graph. Compared to the original videos, the frequency of event alternation in the augmented cataract surgery videos increased by 26%. Further, a 3% higher classification accuracy and a 7.8% higher precision were achieved compared to a state-of-the-art approach. Our approach is particularly helpful for increasing the occurrence of rare but important events and can be applied to a large variety of use cases.
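The three steps named in the abstract (splitting into event segments, extracting a workflow graph, assembling new videos) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes per-frame event labels, treats the workflow graph as the set of event transitions observed in the annotations, and assembles a new video by a uniform random walk over that graph; the abstract does not specify any of these details. All function names and the event labels (`idle`, `incision`, `implant`) are hypothetical.

```python
import random
from collections import defaultdict


def split_into_segments(labels):
    """Group per-frame labels into (event, start, end) runs; end is exclusive."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def build_workflow_graph(segments_per_video):
    """Map each event to the set of events observed directly after it."""
    graph = defaultdict(set)
    for segments in segments_per_video.values():
        for (src, _, _), (dst, _, _) in zip(segments, segments[1:]):
            graph[src].add(dst)
    return graph


def build_segment_pool(segments_per_video):
    """Collect, for each event, every (video_id, start, end) segment showing it."""
    pool = defaultdict(list)
    for video_id, segments in segments_per_video.items():
        for event, start, end in segments:
            pool[event].append((video_id, start, end))
    return pool


def assemble_video(graph, pool, start_event, max_segments, rng):
    """Assumed strategy: random walk over the graph, one stored segment per step."""
    plan, event = [], start_event
    for _ in range(max_segments):
        video_id, start, end = rng.choice(pool[event])
        plan.append((event, video_id, start, end))
        successors = sorted(graph.get(event, ()))  # sorted for reproducibility
        if not successors:
            break
        event = rng.choice(successors)
    return plan


# Hypothetical per-frame annotations for two short videos.
annotations = {
    "video_a": ["idle", "idle", "incision", "incision", "idle", "implant", "idle"],
    "video_b": ["idle", "incision", "idle", "idle", "implant", "implant", "idle"],
}
segments_per_video = {
    vid: split_into_segments(lbl) for vid, lbl in annotations.items()
}
graph = build_workflow_graph(segments_per_video)
pool = build_segment_pool(segments_per_video)
plan = assemble_video(graph, pool, "idle", max_segments=8, rng=random.Random(0))
```

The resulting `plan` is a list of `(event, video_id, start, end)` entries whose frame ranges would be concatenated to form one artificial video. Because every consecutive pair of events in the plan is an edge of the graph, the assembled video only contains transitions that were observed in the original annotations.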