Jiajun Bao

4.9CLAug 11, 2025

Toward Machine Interpreting: Lessons from Human Interpreting Studies

Matthias Sperber, Maureen de Seyssel, Jiajun Bao et al.

Current speech translation systems, while having achieved impressive accuracies, are rather static in their behavior and do not adapt to real-world situations in ways human interpreters do. In order to improve their practical usefulness and enable interpreting-like experiences, a precise understanding of the nature of human interpreting is crucial. To this end, we discuss human interpreting literature from the perspective of the machine translation field, while considering both operational and qualitative aspects. We identify implications for the development of speech translation systems and argue that there is great potential to adopt many human interpreting principles using recent modeling techniques. We hope that our findings provide inspiration for closing the perceived usability gap, and can motivate progress toward true machine interpreting.

2.6CVFeb 16, 2022Code

When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs

Oana Ignat, Santiago Castro, Yuhang Zhou et al.

We consider the task of temporal human action localization in lifestyle vlogs. We introduce a novel dataset consisting of manual annotations of temporal localization for 13,000 narrated actions in 1,200 video clips. We present an extensive analysis of this data, which allows us to better understand how the language and visual modalities interact throughout the videos. We propose a simple yet effective method to localize the narrated actions based on their expected duration. Through several experiments and analyses, we show that our method brings complementary information with respect to previous methods, and leads to improvements over previous work for the task of temporal action localization.

Jiajun Bao

2 Papers