Automatic Parallel Corpus Creation for Hindi-English News Translation Task
This work addresses a domain-specific problem for NLP researchers and developers in Hindi-English news translation, but it is incremental as it builds on existing corpus generation methods.
The authors address the limited size of Hindi-English parallel corpora for news translation by developing a prototype system for automatic corpus generation. They assess the quality of the generated corpus with several performance metrics and describe the results as interesting.
Parallel corpora are very important for multilingual NLP tasks, including deep learning applications and Statistical Machine Translation systems. The Hindi-English parallel corpora available to date for the news translation task are very limited in size relative to what such systems require. In this work, we have developed a prototype of an automatic parallel corpus generation system that creates a Hindi-English parallel corpus for the news translation task. To verify the quality of the generated parallel corpus, we evaluated it using various performance metrics, and the results are quite interesting.