T. El-Shishtawy

CL · Jun 4, 2014

The Best Templates Match Technique For Example Based Machine Translation

T. El-Shishtawy, A. El-Sammak

It has been shown that large-scale, realistic Knowledge-Based Machine Translation applications require the acquisition of a huge amount of knowledge about language and about the world. This knowledge is encoded in computational grammars, lexicons, and domain models. Another approach, which avoids the need to collect and analyze massive knowledge, is the Example-Based approach, which is the topic of this paper. We show that the Example-Based approach in its native form is not suitable for translating into Arabic. Therefore, a modification to the basic approach is presented to improve the accuracy of the translation process. The basic idea of the new approach is to improve the technique by which template-based approaches select the appropriate templates.

CL · Sep 22, 2013

A Hybrid Algorithm for Matching Arabic Names

T. El-Shishtawy

In this paper, a new hybrid algorithm that combines token-based and character-based approaches is presented. The basic Levenshtein approach is extended to a token-based distance metric. The distance metric is enhanced to set the proper granularity level of the algorithm's behavior. It smoothly combines a threshold for misspelling differences at the character level with the importance of token-level errors in terms of a token's position and frequency. Using a large Arabic dataset, the experimental results show that the proposed algorithm successfully overcomes many types of errors, such as typographical errors, omission or insertion of middle name components, omission of non-significant popular name components, and character variations due to different writing styles. When the results are compared with those of other classical algorithms on the same dataset, the proposed algorithm is found to increase the minimum success level of the best tested algorithms while achieving higher upper limits.
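
The abstract gives no formulas, so the following is only a rough sketch of the general idea: a Levenshtein-style dynamic program run over name tokens, where two tokens count as equal if their character-level edit distance stays within a threshold, and where inserting or deleting a token costs less when the token is not in first position or is a very common name component. The function names, the weights (1.0, 0.5, 0.2), and the 0.25 threshold are all assumptions made for illustration and are not taken from the paper. Transliterated names are used to keep the example ASCII-only.

```python
def char_levenshtein(a: str, b: str) -> int:
    """Classic character-level Levenshtein edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def tokens_match(t1: str, t2: str, char_threshold: float = 0.25) -> bool:
    """Two tokens match if their normalized character distance is within the threshold."""
    longest = max(len(t1), len(t2))
    if longest == 0:
        return True
    return char_levenshtein(t1, t2) / longest <= char_threshold


def token_weight(token: str, position: int, frequent_tokens: frozenset) -> float:
    """Hypothetical weighting: the first token matters most; popular components cost less."""
    weight = 1.0 if position == 0 else 0.5
    if token in frequent_tokens:
        weight *= 0.2
    return weight


def hybrid_name_distance(name1: str, name2: str,
                         frequent_tokens: frozenset = frozenset(),
                         char_threshold: float = 0.25) -> float:
    """Token-level edit distance with character-level fuzzy token equality."""
    a, b = name1.split(), name2.split()
    # dp[i][j] = cost of aligning the first i tokens of a with the first j tokens of b
    dp = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        dp[i][0] = dp[i - 1][0] + token_weight(a[i - 1], i - 1, frequent_tokens)
    for j in range(1, len(b) + 1):
        dp[0][j] = dp[0][j - 1] + token_weight(b[j - 1], j - 1, frequent_tokens)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            wa = token_weight(a[i - 1], i - 1, frequent_tokens)
            wb = token_weight(b[j - 1], j - 1, frequent_tokens)
            if tokens_match(a[i - 1], b[j - 1], char_threshold):
                substitute = 0.0  # small misspellings are absorbed at the character level
            else:
                substitute = max(wa, wb)
            dp[i][j] = min(dp[i - 1][j] + wa,          # drop a token from name1
                           dp[i][j - 1] + wb,          # drop a token from name2
                           dp[i - 1][j - 1] + substitute)
    return dp[len(a)][len(b)]


# A one-letter misspelling is absorbed; a missing middle name costs a reduced penalty.
print(hybrid_name_distance("mohamed ali", "mohammed ali"))
print(hybrid_name_distance("mohamed ahmed ali", "mohamed ali"))
```

In this sketch, a lower score means a closer match, and a decision threshold on the final score would still have to be tuned on real data.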


2 Papers