CL · Jun 3, 2024

MACT: Model-Agnostic Cross-Lingual Training for Discourse Representation Structure Parsing

arXiv:2406.01052v13.44 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses semantic representation parsing for natural language understanding across multiple languages. The contribution appears incremental, since it builds on existing cross-lingual methods.

The paper addresses the limited performance of Discourse Representation Structure (DRS) parsing models trained only on monolingual data. It introduces a model-agnostic cross-lingual training strategy that leverages the language alignments encoded in pre-trained models, and it reports state-of-the-art results on standard benchmarks for English, German, Italian, and Dutch.

Discourse Representation Structure (DRS) is an innovative semantic representation designed to capture the meaning of texts of arbitrary length across languages. Semantic representation parsing is essential for achieving natural language understanding through logical forms. Nevertheless, the performance of DRS parsing models remains constrained when they are trained exclusively on monolingual data. To tackle this issue, we introduce a cross-lingual training strategy. The proposed method is model-agnostic yet highly effective. It leverages cross-lingual training data and fully exploits the alignments between languages encoded in pre-trained language models. Experiments on the standard benchmarks demonstrate that models trained with the cross-lingual method show significant improvements in DRS clause and graph parsing in English, German, Italian, and Dutch. Compared with previous work, our final models achieve state-of-the-art results on the standard benchmarks. Furthermore, the detailed analysis provides deep insights into the performance of the parsers, offering inspiration for future research in DRS parsing. We keep adding new benchmark results to the appendix.
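The abstract does not spell out the training recipe, so the following is only a rough illustration of what "leveraging cross-lingual training data" could look like in practice. It assumes the simplest possible setup: the monolingual (text, DRS) training sets for each language are pooled into one shuffled set that a multilingual pre-trained model would then be fine-tuned on. This is a sketch under that assumption, not the authors' method. The function name `build_cross_lingual_set`, the toy sentences, and the `DRS_PLACEHOLDER` targets are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

# (source sentence, linearized DRS target); the targets here are placeholders,
# not real DRS annotations.
Example = Tuple[str, str]


def build_cross_lingual_set(
    per_language: Dict[str, List[Example]], seed: int = 0
) -> List[Tuple[str, str, str]]:
    """Pool monolingual (text, DRS) pairs into one shuffled training set.

    Assumption: cross-lingual training is approximated by simple data pooling.
    Each returned item is (language code, text, DRS target).
    """
    pooled = [
        (lang, text, drs)
        for lang, examples in per_language.items()
        for text, drs in examples
    ]
    # A fixed seed keeps the shuffled order reproducible across runs.
    random.Random(seed).shuffle(pooled)
    return pooled


# Hypothetical toy data for the four languages named in the abstract.
toy_data: Dict[str, List[Example]] = {
    "en": [("Tom smiles.", "DRS_PLACEHOLDER")],
    "de": [("Tom lächelt.", "DRS_PLACEHOLDER")],
    "it": [("Tom sorride.", "DRS_PLACEHOLDER")],
    "nl": [("Tom glimlacht.", "DRS_PLACEHOLDER")],
}

pooled = build_cross_lingual_set(toy_data)
```

Because the pooled set is independent of any particular parser architecture, the same data could in principle be fed to different models, which is one plausible reading of "model-agnostic".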
