CL, AI | Sep 27, 2025

Trainable Reference-Based Evaluation Metric for Identifying Quality of English-Gujarati Machine Translation System

Nisheeth Joshi, Pragya Katyayan, Palak Arora

arXiv:2510.05113v1 | h-index: 5 | AIP Conf Proc

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for effective evaluation metrics for Indian languages such as Gujarati, where standard methods developed for European languages perform poorly. It represents a domain-specific, incremental improvement.

The paper tackles the problem of evaluating English-Gujarati machine translation by introducing a trainable reference-based metric, which correlated better with human judgments than existing metrics.

Machine Translation (MT) evaluation is an integral part of the MT development life cycle. Without analyzing the outputs of MT engines, it is impossible to evaluate the performance of an MT system. Experiments have shown that what works for English and other European languages does not work well for Indian languages. In this paper, we therefore introduce a reference-based MT evaluation metric for Gujarati that is based on supervised learning. We trained two versions of the metric, each using 25 features. One model has 6 hidden layers and the other has 10 hidden layers; both were trained for 500 epochs. To test the performance of the metric, we collected 1000 MT outputs from seven MT systems. These MT engine outputs were compared against one human reference translation. When the developed metrics were compared with other available metrics, they were found to correlate better with human judgments.
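To make the notion of "human correlation" concrete, here is a minimal sketch of how a metric's scores can be compared with human judgments. It assumes Pearson correlation, which the abstract does not specify; the function name `pearson` and all score values below are hypothetical illustrations, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical example: human ratings for five MT outputs, plus the scores
# that two made-up metrics assigned to the same outputs.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
metric_a = [0.10, 0.25, 0.30, 0.45, 0.50]
metric_b = [0.30, 0.10, 0.40, 0.20, 0.50]

r_a = pearson(human, metric_a)
r_b = pearson(human, metric_b)
print(f"metric_a vs human: {r_a:.3f}")
print(f"metric_b vs human: {r_b:.3f}")
```

Under this reading, the metric with the higher coefficient tracks human judgments more closely. A rank-based coefficient such as Spearman or Kendall could be substituted if only the ordering of outputs matters.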
