Quality Evaluation of the Low-Resource Synthetically Generated Code-Mixed Hinglish Text
This work addresses quality assessment for low-resource code-mixed text generation. The contribution is incremental: it focuses on evaluation rather than on novel generation methods.
The paper tackles the problem of evaluating the quality of synthetically generated code-mixed Hinglish text by proposing two subtasks: quality rating prediction and annotators' disagreement prediction. The sentences are produced by two distinct generation approaches, and human annotations are used to assess their quality.
In this shared task, we invite participating teams to investigate the factors that influence the quality of code-mixed text generation systems. We synthetically generate code-mixed Hinglish sentences using two distinct approaches and employ human annotators to rate the generation quality. We propose two subtasks on the synthetic Hinglish dataset: quality rating prediction and annotators' disagreement prediction. The proposed subtasks aim to surface the reasoning behind the factors that influence the quality and human perception of code-mixed text.
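To make the two subtasks concrete, the sketch below shows one plausible way to derive prediction targets from per-sentence human ratings. The abstract does not specify the rating scale, the number of annotators, or how the targets are computed, so everything here is an assumption for illustration: the function name `build_targets` is hypothetical, the quality target is taken to be the mean rating, and the disagreement target is taken to be the spread between the highest and lowest rating.

```python
from statistics import mean


def build_targets(ratings):
    """Derive illustrative targets for one sentence.

    ratings: list of numeric scores, one per annotator (assumed format).
    Returns (quality, disagreement), where quality is the mean rating and
    disagreement is the max-min spread. Both definitions are assumptions,
    not the task's official ones.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two annotator ratings")
    quality = mean(ratings)
    disagreement = max(ratings) - min(ratings)
    return quality, disagreement


# Toy, made-up ratings keyed by a hypothetical sentence id.
toy_ratings = {"s1": [8, 7], "s2": [4, 4], "s3": [2, 9]}

for sentence_id, ratings in toy_ratings.items():
    quality, disagreement = build_targets(ratings)
    print(sentence_id, quality, disagreement)
```

Under these assumptions, a model for the first subtask would predict the quality value and a model for the second would predict the disagreement value for each generated sentence.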