English-Catalan Neural Machine Translation in the Biomedical Domain through the cascade approach
This work addresses a low-resource translation problem for biomedical researchers and practitioners in Catalan-speaking regions, but it is incremental as it applies existing cascade methods to a new language pair and domain.
The paper builds a neural machine translation system for English-Catalan in the biomedical domain, a low-resource task. It uses a cascade pivot strategy through Spanish and introduces a new test dataset for evaluation.
This paper describes the methodology followed to build a neural machine translation system in the biomedical domain for the English-Catalan language pair. The task can be considered low-resource in terms of both the domain and the language pair. To address it, the paper reports experiments on a cascade pivot strategy through Spanish for neural machine translation, using the English-Spanish SCIELO and Spanish-Catalan El Periódico databases. To test the final performance of the system, we have created a new test data set for English-Catalan in the biomedical domain, which is freely available on request.
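The cascade pivot strategy can be illustrated with a minimal sketch: the output of an English-Spanish system is fed as input to a Spanish-Catalan system. The code below is only an illustration under stated assumptions. The toy word-for-word lexicons stand in for the trained neural models, and the names `make_toy_translator`, `cascade`, `en_es`, `es_ca`, and `en_ca` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict

# A translator is any function mapping a source sentence to a target sentence.
Translator = Callable[[str], str]


def make_toy_translator(lexicon: Dict[str, str]) -> Translator:
    """Build a word-for-word translator (hypothetical stand-in for an NMT model)."""
    def translate(sentence: str) -> str:
        # Unknown tokens are copied through unchanged.
        return " ".join(lexicon.get(tok, tok) for tok in sentence.split())
    return translate


def cascade(first: Translator, second: Translator) -> Translator:
    """Chain two translators: source -> pivot -> target."""
    def translate(sentence: str) -> str:
        pivot = first(sentence)   # e.g. English -> Spanish
        return second(pivot)      # e.g. Spanish -> Catalan
    return translate


# Toy lexicons (assumed for illustration only, not taken from the paper's data).
en_es = make_toy_translator(
    {"the": "el", "patient": "paciente", "has": "tiene", "fever": "fiebre"}
)
es_ca = make_toy_translator(
    {"paciente": "pacient", "tiene": "té", "fiebre": "febre"}
)

en_ca = cascade(en_es, es_ca)

if __name__ == "__main__":
    print(en_ca("the patient has fever"))  # prints "el pacient té febre"
```

In a cascade of this kind, the two systems are trained independently on their own parallel data, so no English-Catalan parallel corpus is needed for training. One known trade-off is that errors made in the first step propagate to the second.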