CL | May 13, 2022

Joint Generation of Captions and Subtitles with Dual Decoding

arXiv:2205.06522v1639 citations | h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient, consistent automatic captioning and subtitling, which would reduce post-production costs for media producers, though the contribution appears incremental.

The paper tackles the problem of generating consistent captions and subtitles for audio-visual content by proposing a dual decoding scheme that couples the two tasks. It reports improved adequacy and consistency with virtually no increase in model size or training complexity.

As the amount of audio-visual content increases, the need to develop automatic captioning and subtitling solutions to match the expectations of a growing international audience appears as the only viable way to boost throughput and lower the related post-production costs. Automatic captioning and subtitling often need to be tightly intertwined to achieve an appropriate level of consistency and synchronization with each other and with the video signal. In this work, we assess a dual decoding scheme to achieve a strong coupling between these two tasks and show how adequacy and consistency are increased, with virtually no additional cost in terms of model size and training complexity.

Code Implementations: 1 repo