Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022
This work addresses video understanding for egocentric AI applications, but its contribution is incremental: it applies an existing method, masked autoencoders, to new tasks.
The authors tackled object state change classification and point-of-no-return (PNR) temporal localization in egocentric videos for the Ego4D Challenge 2022, placing second in both tasks.
In this report, we present our approach and empirical results from applying masked autoencoders to two egocentric video understanding tasks of the Ego4D Challenge 2022: Object State Change Classification and PNR Temporal Localization. As team TheSSVL, we ranked 2nd in both tasks. Our code will be made available.