A Lemma Based Evaluator for Semitic Language Text Summarization Systems
This work addresses the challenge of accurately evaluating text summarization for Semitic languages, though the contribution is incremental because it builds on existing ROUGE methods.
The paper tackles the problem of evaluating text summarization systems for highly inflected languages such as Arabic by developing a lemma-based matching strategy as an extension of ROUGE, which improves similarity detection between sentences that share the same semantics but use different lexical forms.
Matching texts in highly inflected languages such as Arabic with a simple stemming strategy is unlikely to perform well. In this paper, we present an automatic text matching technique for inflectional languages, using Arabic as the test case. The system is an extension of ROUGE in which texts are matched at the lemma level of each token. The experimental results show improved detection of similarities between sentences that have the same semantics but are written in different lexical forms.
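The idea of matching at the lemma level can be illustrated with a minimal sketch. It is not the paper's implementation: the lemma table below is a hypothetical toy dictionary of simplified transliterations, standing in for a real Arabic morphological analyzer, and the function names (lemmatize, rouge_n_recall) are assumptions made for this example. The score computed is ordinary ROUGE-N recall with clipped n-gram counts, applied once to surface tokens and once to their lemmas.

```python
from collections import Counter

# Hypothetical toy lemma table (simplified transliterations), used only for
# illustration. A real system would obtain lemmas from a morphological analyzer.
TOY_LEMMAS = {
    "alwalad": "walad",  # "the boy"  -> "boy"
    "awlad": "walad",    # "boys"     -> "boy" (broken plural)
    "yaqra": "qara",     # "he reads" -> "read"
    "alkitab": "kitab",  # "the book" -> "book"
    "kutub": "kitab",    # "books"    -> "book" (broken plural)
}


def lemmatize(tokens, lemma_table):
    """Map each token to its lemma; unknown tokens are left unchanged."""
    return [lemma_table.get(tok, tok) for tok in tokens]


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count."""
    ref = ngram_counts(reference, n)
    cand = ngram_counts(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum((ref & cand).values())  # multiset intersection clips counts
    return overlap / total


reference = "alwalad qara alkitab".split()
candidate = "awlad yaqra kutub".split()

# Surface forms share no tokens, so plain ROUGE-1 recall finds no overlap.
surface_score = rouge_n_recall(candidate, reference)

# After mapping both sides to lemmas, all three unigrams match.
lemma_score = rouge_n_recall(
    lemmatize(candidate, TOY_LEMMAS),
    lemmatize(reference, TOY_LEMMAS),
)

print(surface_score, lemma_score)
```

In this toy case the two sentences differ in every surface token, so exact matching reports no overlap, while lemma-level matching recovers it. Broken plurals such as kutub are also a case where stripping affixes alone would not reach the shared form, which is consistent with the abstract's remark about simple stemming.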