CL, Feb 2, 2021

Two Demonstrations of the Machine Translation Applications to Historical Documents

arXiv:2102.01417v1 (has code)

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of making historical documents more accessible and readable for researchers and the general public by modernizing language and orthography.

This paper demonstrates two machine translation applications for historical documents: one translates documents into a modern version of their original language, and the other modernizes document orthography. The system uses an interactive, adaptive framework where user corrections lead to new hypotheses and online model adaptation.

We present our demonstration of two machine translation applications to historical documents. The first application generates a new version of a historical document, written in the modern version of its original language. The second application is limited to a document's orthography: it adapts the document's spelling to modern standards in order to achieve orthographic consistency and to account for the lack of spelling conventions. We followed an interactive, adaptive framework that allows the user to introduce corrections to the system's hypothesis. The system reacts to these corrections by generating a new hypothesis that takes them into account. Once the user is satisfied with the system's hypothesis and validates it, the system adapts its model following an online learning strategy. The system is implemented following a client-server architecture: we developed a website which communicates with the neural models. All code is open-source and publicly available. The demonstration is hosted at http://demosmt.prhlt.upv.es/mthd/.
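The interactive, adaptive loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It makes three assumptions: a word-level lookup table stands in for the neural models, corrections are handled with a prefix-based protocol (everything up to the corrected word is treated as validated), and source and target are aligned word by word, as in spelling modernization. The names `ToyModernizer` and `interactive_session` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyModernizer:
    """Hypothetical stand-in for the neural model: a word-level lookup table."""
    lexicon: Dict[str, str] = field(default_factory=dict)

    def translate(self, source: str, prefix: Optional[List[str]] = None) -> List[str]:
        """Return a hypothesis that keeps the user-validated prefix verbatim."""
        prefix = prefix or []
        words = source.split()
        # Only the positions after the validated prefix are regenerated.
        suffix = [self.lexicon.get(w, w) for w in words[len(prefix):]]
        return prefix + suffix

    def adapt(self, source: str, validated: List[str]) -> None:
        """Online update: learn word pairs from the sentence the user validated."""
        for src, tgt in zip(source.split(), validated):
            if src != tgt:
                self.lexicon[src] = tgt


def interactive_session(model: ToyModernizer, source: str,
                        corrections: Dict[int, str]) -> List[str]:
    """Simulate a user who corrects the words at the given positions.

    `corrections` maps a word position to the user's corrected word
    (an assumed, simplified model of user feedback).
    """
    hypothesis = model.translate(source)
    for position in sorted(corrections):
        # The user fixes one word; the text before it is taken as validated.
        prefix = hypothesis[:position] + [corrections[position]]
        # The system reacts with a new hypothesis that respects the correction.
        hypothesis = model.translate(source, prefix)
    # The user validates the final hypothesis, so the model adapts to it.
    model.adapt(source, hypothesis)
    return hypothesis


model = ToyModernizer()
first = interactive_session(model, "dixo que auia", {0: "dijo", 2: "había"})
# After adaptation, the same spellings are modernized without user help.
second = model.translate("auia dixo")
```

In this sketch the second call benefits from the corrections made during the first session, which is the effect an online learning strategy aims for. A real system would update model parameters rather than a lookup table.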
