CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022
This work addresses the need for efficient real-time speech translation by onlinizing existing offline models, that is, by using them in a simultaneous setting without modification.
The paper tackles the problem of adapting offline speech translation models for simultaneous translation without modifying the original model. The onlinized model is almost on par with the offline setting while being $3\times$ faster than offline in terms of latency on the test set. It also outperforms the best IWSLT 2021 simultaneous system in the medium and high latency regimes and is almost on par with it in the low latency regime.
In this paper, we describe our submission to the Simultaneous Speech Translation task at IWSLT 2022. We explore strategies to utilize an offline model in a simultaneous setting without the need to modify the original model. In our experiments, we show that our onlinization algorithm is almost on par with the offline setting while being $3\times$ faster than offline in terms of latency on the test set. We also show that the onlinized offline model outperforms the best IWSLT 2021 simultaneous system in the medium and high latency regimes and is almost on par in the low latency regime. We make our system publicly available.
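To illustrate what using an offline model in a simultaneous setting without modifying it can look like, the following is a minimal sketch and not necessarily the algorithm of our submission. It assumes one generic strategy: re-translate the growing audio prefix after every incoming chunk and commit only the tokens on which two consecutive hypotheses agree. The names `onlinize`, `common_prefix`, and `offline_translate` are hypothetical, and the offline model is treated as a black-box function from audio samples to tokens.

```python
from typing import Callable, Iterable, Iterator, List


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest common prefix of two token sequences."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


def onlinize(
    offline_translate: Callable[[List[float]], List[str]],  # hypothetical black-box model
    audio_chunks: Iterable[List[float]],
) -> Iterator[List[str]]:
    """Yield the newly committed tokens after each chunk, then a final flush."""
    audio: List[float] = []      # all audio received so far
    committed: List[str] = []    # tokens already shown to the user
    previous: List[str] = []     # hypothesis from the previous step

    for chunk in audio_chunks:
        audio.extend(chunk)
        # Re-run the unmodified offline model on the whole prefix.
        hypothesis = offline_translate(audio)
        # Assumption: tokens shared by two consecutive hypotheses are stable.
        stable = common_prefix(previous, hypothesis)
        # Emit only tokens that extend the committed output.
        if stable[: len(committed)] == committed:
            new_tokens = stable[len(committed):]
        else:
            new_tokens = []
        committed.extend(new_tokens)
        previous = hypothesis
        yield new_tokens

    # End of input: emit whatever the last hypothesis adds beyond the
    # committed length. A real system would instead constrain decoding
    # to continue the committed prefix.
    yield previous[len(committed):]
```

In this sketch, the chunk size controls the quality-latency trade-off: larger chunks give the model more context per step but delay each emission.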