CL May 17, 2024

SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

Marco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

arXiv:2405.10741v1, 14.126 citations, h-index: 34, Has Code, ACL

Originality: Incremental advance

AI Analysis

This work addresses the need for more robust and accessible subtitling tools for audiovisual content. It is an incremental advance, since it builds on prior transcription-free work for translation and segmentation.

The paper tackles the problem of generating automatic subtitles without relying on transcripts, extending the transcription-free approach to timestamp prediction, and reports new state-of-the-art performance across multiple language pairs and diverse conditions.

Subtitling plays a crucial role in enhancing the accessibility of audiovisual content and encompasses three primary subtasks: translating spoken dialogue, segmenting translations into concise textual units, and estimating timestamps that govern their on-screen duration. Past attempts to automate this process rely, to varying degrees, on automatic transcripts, employed diversely for the three subtasks. In response to the acknowledged limitations associated with this reliance on transcripts, recent research has shifted towards transcription-free solutions for translation and segmentation, leaving the direct generation of timestamps as uncharted territory. To fill this gap, we introduce the first direct model capable of producing automatic subtitles, entirely eliminating any dependence on intermediate transcripts also for timestamp prediction. Experimental results, backed by manual evaluation, showcase our solution's new state-of-the-art performance across multiple language pairs and diverse conditions.
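To make the three subtasks concrete, the following is a minimal sketch of what a subtitler's output could look like as a data structure: translated text, segmented into on-screen lines, with start and end timestamps. It does not reflect the paper's model or code. The names `SubtitleBlock`, `format_timestamp`, and `to_srt` are hypothetical, and the choice of SRT as the serialization format is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubtitleBlock:
    """One on-screen subtitle (hypothetical structure, for illustration)."""
    lines: List[str]  # segmented translation, one entry per on-screen line
    start: float      # display start, in seconds from the start of the audio
    end: float        # display end, in seconds from the start of the audio


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(blocks: List[SubtitleBlock]) -> str:
    """Serialize blocks as SRT: index, timing line, text lines, blank line."""
    entries = []
    for index, block in enumerate(blocks, start=1):
        timing = f"{format_timestamp(block.start)} --> {format_timestamp(block.end)}"
        entries.append("\n".join([str(index), timing, *block.lines]))
    return "\n\n".join(entries) + "\n"
```

In this sketch, translation fills the text, segmentation decides how it is split across `lines` and across blocks, and timestamp estimation supplies `start` and `end`. A direct model in the sense of the abstract would produce all three without an intermediate transcript.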
