CL, Jul 24, 2018

Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!

Steffen Eger, Johannes Daxenberger, Christian Stab, Iryna Gurevych

arXiv:1807.08998v132.11096 citations, Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses cross-lingual argumentation mining, a problem relevant to both researchers and practitioners, but its contribution is incremental: it builds on existing methods and adds new data and comparisons.

The paper tackles the lack of suitable resources for cross-lingual argumentation mining by translating a persuasive essay dataset into multiple languages to create parallel corpora. It finds that annotation projection nearly eliminates cross-lingual transfer loss and performs equally well with human or machine translations.

Argumentation mining (AM) requires the identification of complex discourse structures and has lately been applied with success monolingually. In this work, we show that the existing resources are, however, not adequate for assessing cross-lingual AM, due to their heterogeneity or lack of complexity. We therefore create suitable parallel corpora by (human and machine) translating a popular AM dataset consisting of persuasive student essays into German, French, Spanish, and Chinese. We then compare (i) annotation projection and (ii) bilingual word embeddings based direct transfer strategies for cross-lingual AM, finding that the former performs considerably better and almost eliminates the loss from cross-lingual transfer. Moreover, we find that annotation projection works equally well when using either costly human or cheap machine translations. Our code and data are available at http://github.com/UKPLab/coling2018-xling_argument_mining.
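
To make the idea of annotation projection more concrete, here is a minimal sketch of how token-level argument labels might be carried from a source sentence to its translation through word alignments. It is an illustration under stated assumptions, not the authors' implementation: the BIO tag scheme, the function names `extract_spans` and `project_labels`, and the alignment format (pairs of source and target token indices, assumed to come from some external word aligner) are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def extract_spans(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, type) for each BIO-labelled span, end inclusive."""
    spans = []
    start, kind = None, None
    for i, lab in enumerate(labels + ["O"]):  # sentinel "O" closes a final span
        inside = lab.startswith("I-") and kind == lab[2:]
        if start is not None and not inside:
            spans.append((start, i - 1, kind))
            start, kind = None, None
        if lab.startswith("B-") or (lab.startswith("I-") and start is None):
            start, kind = i, lab[2:]
    return spans


def project_labels(
    src_labels: List[str],
    alignments: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Project source BIO labels onto a target sentence of tgt_len tokens.

    Assumption: each source span is mapped to the contiguous target range
    between the leftmost and rightmost target token aligned to it.
    If projected spans overlap, the later span overwrites the earlier one.
    """
    tgt = ["O"] * tgt_len
    for s_start, s_end, kind in extract_spans(src_labels):
        hits = [t for s, t in alignments
                if s_start <= s <= s_end and 0 <= t < tgt_len]
        if not hits:
            continue  # span has no aligned target tokens; it is dropped
        lo, hi = min(hits), max(hits)
        tgt[lo] = "B-" + kind
        for t in range(lo + 1, hi + 1):
            tgt[t] = "I-" + kind
    return tgt


# Hypothetical example: English source, German translation.
# Source: I think school uniforms are useful
# Target: Ich denke , Schuluniformen sind nützlich
src_labels = ["O", "O", "B-Claim", "I-Claim", "I-Claim", "I-Claim"]
alignments = [(0, 0), (1, 1), (2, 3), (3, 3), (4, 4), (5, 5)]
projected = project_labels(src_labels, alignments, tgt_len=6)
print(projected)
```

A model for the target language could then be trained on the projected labels. Filling the whole range between the leftmost and rightmost aligned token keeps each projected component contiguous, which is one simple way to cope with unaligned target tokens; other strategies are possible.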
