CL | Jul 24, 2021

MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation

arXiv:2107.11534v1661 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of reliable evaluation for code-mixed NLG tasks, which matters to researchers and practitioners working with multilingual applications. The contribution appears incremental because it builds on existing evaluation frameworks.

The paper tackles the evaluation of natural language generation (NLG) for code-mixed text, where existing metrics perform poorly. It proposes MIPE, a metric-independent pipeline that significantly improves correlation with human judgments, and demonstrates it on Hinglish sentences from the HinGE corpus.

Code-mixing is the phenomenon of mixing words and phrases from two or more languages in a single utterance of speech or text. Owing to this high linguistic diversity, code-mixing presents several challenges in evaluating standard natural language generation (NLG) tasks. Several widely used metrics perform poorly on code-mixed NLG tasks. To address this challenge, we present MIPE, a metric-independent evaluation pipeline that significantly improves the correlation between evaluation metrics and human judgments on generated code-mixed text. As a use case, we demonstrate the performance of MIPE on machine-generated Hinglish (code-mixed Hindi and English) sentences from the HinGE corpus. The proposed evaluation strategy can be extended to other code-mixed language pairs, NLG tasks, and evaluation metrics with minimal to no effort.
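The abstract judges a metric by how well its scores correlate with human judgments. As a minimal sketch of that measurement only (not of the MIPE pipeline itself, whose steps are not described here), the Pearson correlation between metric scores and human ratings could be computed as below. The function name `pearson`, the choice of Pearson rather than another correlation measure, and all numeric values are assumptions made for illustration; none of them come from the paper or the HinGE corpus.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (not from the paper or HinGE):
# one automatic metric score and one human rating per generated sentence.
metric_scores = [0.21, 0.35, 0.10, 0.50, 0.42]
human_ratings = [3, 6, 2, 9, 7]

r = pearson(metric_scores, human_ratings)
print(f"Pearson r = {r:.3f}")
```

Under this framing, an evaluation pipeline improves a metric if the same computation yields a higher correlation with human ratings after the pipeline is applied than before.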

