CL, Jul 11, 2024

System Report for CCL24-Eval Task 7: Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation

Jingshen Zhang, Xiangyu Yang, Xinkai Su, Xinglu Chen, Tianyou Huang, Xinying Qiu

arXiv:2407.08206v1

AI Analysis

This work addresses the problem of automated essay evaluation for Chinese learners, representing an incremental improvement in a domain-specific competition setting.

The paper tackles the Chinese Essay Fluency Evaluation task with multi-error modeling and fluency-targeted pre-training, and the system placed first in Track 3 of the CCL-2024 competition.

This system report presents our approaches and results for the Chinese Essay Fluency Evaluation (CEFE) task at CCL-2024. For Track 1, we optimized predictions for challenging fine-grained error types using binary classification models and trained coarse-grained models on the Chinese Learner 4W corpus. In Track 2, we enhanced performance by constructing a pseudo-dataset with multiple error types per sentence. For Track 3, where we achieved first place, we generated fluency-rated pseudo-data via back-translation for pre-training and used an NSP-based strategy with Symmetric Cross Entropy loss to capture context and mitigate long dependencies. Our methods effectively address key challenges in Chinese Essay Fluency Evaluation.
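The Symmetric Cross Entropy loss mentioned for Track 3 can be illustrated with a minimal sketch. It follows the general definition of the loss (standard cross entropy plus a reverse cross entropy term in which log 0 on the one-hot label is replaced by a negative constant), not the authors' implementation. The function name, the weights `alpha` and `beta`, the constant `log_zero`, and the three-class example are all assumptions made for illustration; the abstract does not state the values used in the system.

```python
import math

def symmetric_cross_entropy(probs, label, alpha=1.0, beta=1.0,
                            log_zero=-4.0, eps=1e-7):
    """Symmetric cross entropy for one example (illustrative sketch).

    probs: predicted class probabilities (should sum to 1).
    label: index of the gold class.
    alpha, beta, log_zero: hypothetical hyperparameters, not from the paper.
    """
    # Standard cross entropy: -log p(gold), with p clamped away from zero.
    p_gold = min(max(probs[label], eps), 1.0)
    ce = -math.log(p_gold)

    # Reverse cross entropy: -sum_k p_k * log q_k, where q is the one-hot label.
    # log q_k is 0 for the gold class and is replaced by log_zero elsewhere.
    rce = -sum(p * log_zero for k, p in enumerate(probs) if k != label)

    return alpha * ce + beta * rce

# Example: a three-way fluency label where the model puts 0.7 on the gold class.
loss = symmetric_cross_entropy([0.7, 0.2, 0.1], label=0)
```

The reverse term grows with the probability mass placed on non-gold classes but stays bounded, which is why this loss is commonly used when training labels may be noisy, as pseudo-data labels can be.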
