An Arabic Dependency Treebank in the Travel Domain
This work provides a specialized dataset for Arabic NLP researchers, but the contribution is incremental, since it adapts existing methods to a new domain.
The authors address the lack of domain-specific resources for Arabic NLP by creating a dependency treebank of travel domain sentences in Modern Standard Arabic, translated from English sentences in the Basic Traveling Expressions Corpus. They also present parsing results to analyze the effect of domain and genre differences.
In this paper, we present a dependency treebank of travel domain sentences in Modern Standard Arabic. The text is a translation of the equivalent English sentences in the Basic Traveling Expressions Corpus. The treebank dependency representation is in the style of the Columbia Arabic Treebank. The paper motivates the effort and discusses the construction process and guidelines. We also present parsing results and discuss the effect of domain and genre differences on parsing.