Multi-model learning by sequential reading of untrimmed videos for action recognition
This work addresses action recognition in videos for computer vision applications, but it appears incremental because it builds on existing multi-model and federated learning approaches.
The paper tackles action recognition in untrimmed videos by proposing a method that trains multiple models on sequentially extracted clips and synchronizes them through federated learning, which improves performance over training without synchronization.
We propose a new method for learning from videos that aggregates multiple models while sequentially extracting clips from untrimmed videos. The proposed method reduces the correlation between clips by feeding them to multiple models in turn, and it synchronizes these models through federated learning. Experimental results show that the proposed method improves performance compared to training without synchronization.
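The two mechanisms named in the abstract, feeding clips to models in turn and synchronizing the models, can be illustrated with a minimal sketch. The abstract does not give the exact assignment or synchronization rule, so this sketch assumes a plain round-robin assignment of consecutive clips and an equal-weight parameter average in the style of federated averaging. All names (`assign_clips_round_robin`, `average_parameters`, `synchronize`) and the representation of parameters as lists of floats are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical parameter container: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def assign_clips_round_robin(num_clips: int, num_models: int) -> List[List[int]]:
    """Send consecutive clip indices to different models in turn.

    Assumption: clip i goes to model i % num_models, so no model sees
    two temporally adjacent clips back to back.
    """
    if num_models <= 0:
        raise ValueError("num_models must be positive")
    assignment: List[List[int]] = [[] for _ in range(num_models)]
    for clip_idx in range(num_clips):
        assignment[clip_idx % num_models].append(clip_idx)
    return assignment


def average_parameters(models: List[Params]) -> Params:
    """Equal-weight element-wise average of the models' parameters.

    Assumption: the paper's synchronization rule may differ from this
    simple average.
    """
    if not models:
        raise ValueError("need at least one model")
    averaged: Params = {}
    for name in models[0]:
        # Group the i-th value of this parameter across all models.
        columns = zip(*(m[name] for m in models))
        averaged[name] = [sum(vals) / len(models) for vals in columns]
    return averaged


def synchronize(models: List[Params]) -> List[Params]:
    """Replace every model's parameters with an independent copy of the average."""
    avg = average_parameters(models)
    return [{name: list(values) for name, values in avg.items()} for _ in models]
```

In a training loop built on this sketch, each model would take gradient steps on its own clip list from `assign_clips_round_robin`, and `synchronize` would be called at some chosen interval. The interval is not stated in the abstract and would be a further assumption.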