Egocentric Video Task Translation @ Ego4D Challenge 2022
This work improves performance on specific egocentric video tasks in the Ego4D challenge, but the contribution is incremental: it builds on existing models without changing their architectures.
The paper improves egocentric video tasks by reusing models from related tasks through a task translator, placing 1st in the talking to me challenge and 3rd in the PNR keyframe localization challenge of the Ego4D challenge 2022.
This technical report describes the EgoTask Translation approach, which explores relations among a set of egocentric video tasks in the Ego4D challenge. To improve the primary task of interest, we propose to leverage existing models developed for other related tasks and design a task translator that learns to "translate" auxiliary task features to the primary task. With no modification to the baseline architectures, our proposed approach achieves competitive performance on two Ego4D challenges, ranking 1st in the talking to me challenge and 3rd in the PNR keyframe localization challenge.
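The task translator idea can be illustrated with a minimal sketch. The abstract does not specify the translator's internals, so everything below is an assumption made for illustration: the stand-in functions `primary_model` and `auxiliary_model`, the class name `TaskTranslator`, the softmax-weighted fusion, and all dimensions are hypothetical and are not the report's actual design. The sketch only shows the general pattern: existing task models are left unmodified, and a small separate module maps their features to a prediction for the primary task.

```python
import math
import random


def linear(x, weights, bias):
    """Apply y = W x + b to a feature vector given as a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical stand-ins for existing task models. In this sketch they are
# plain functions that are never modified; only the translator has parameters.
def primary_model(clip):
    return [sum(clip) / len(clip), max(clip), min(clip), clip[0]]


def auxiliary_model(clip):
    return [clip[-1] - clip[0], sum(abs(v) for v in clip), float(len(clip))]


class TaskTranslator:
    """Assumed design: project each task's features into a shared space,
    fuse them with softmax weights, and predict primary-task logits."""

    def __init__(self, feature_dims, shared_dim, num_classes, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # One projection per task model, since feature sizes may differ.
        self.projections = [(init(shared_dim, d), [0.0] * shared_dim)
                            for d in feature_dims]
        # Vector used to score how much each task's token contributes.
        self.score_vector = [rng.uniform(-0.1, 0.1) for _ in range(shared_dim)]
        # Classification head for the primary task.
        self.head = (init(num_classes, shared_dim), [0.0] * num_classes)

    def forward(self, task_features):
        tokens = [linear(f, w, b)
                  for f, (w, b) in zip(task_features, self.projections)]
        scores = [sum(s * t for s, t in zip(self.score_vector, tok))
                  for tok in tokens]
        weights = softmax(scores)
        fused = [sum(wt * tok[i] for wt, tok in zip(weights, tokens))
                 for i in range(len(tokens[0]))]
        logits = linear(fused, *self.head)
        return logits, weights


clip = [0.1, 0.4, 0.3, 0.9]  # toy stand-in for an egocentric video clip
features = [primary_model(clip), auxiliary_model(clip)]
translator = TaskTranslator(feature_dims=[4, 3], shared_dim=8, num_classes=2)
logits, weights = translator.forward(features)
```

Here `weights` holds one fusion weight per task model and `logits` holds the untrained primary-task scores. In a trained version of this sketch, only the translator's parameters would be updated, which is one way to keep the baseline architectures untouched as the abstract describes.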