CL Jul 14, 2024

Mitigating Translationese in Low-resource Languages: The Storyboard Approach

arXiv:2407.10152v181 citations h-index: 32
Originality: Incremental advance
AI Analysis

This paper addresses data quality issues for low-resource language communities, but the advance is incremental because it modifies existing data collection approaches.

The paper tackles translationese in low-resource languages by proposing a storyboard-based data collection method, which yielded lower accuracy but better fluency than traditional translation methods.

Low-resource languages often face challenges in acquiring high-quality language data due to the reliance on translation-based methods, which can introduce the translationese effect. This phenomenon results in translated sentences that lack fluency and naturalness in the target language. In this paper, we propose a novel approach for data collection by leveraging storyboards to elicit more fluent and natural sentences. Our method involves presenting native speakers with visual stimuli in the form of storyboards and collecting their descriptions without direct exposure to the source text. We conducted a comprehensive evaluation comparing our storyboard-based approach with traditional text translation-based methods in terms of accuracy and fluency. Human annotators and quantitative metrics were used to assess translation quality. The results indicate a preference for text translation in terms of accuracy, while our method yields better fluency in the language studied.
