CL · Apr 15, 2021

TorontoCL at CMCL 2021 Shared Task: RoBERTa with Multi-Stage Fine-Tuning for Eye-Tracking Prediction

arXiv:2104.07244v1726 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem for computational linguistics researchers by providing an incremental improvement in eye-tracking prediction.

The paper predicts human reading patterns from eye-tracking data using a RoBERTa-based model with multi-stage fine-tuning, achieving an MAE of 3.929 and ranking 3rd out of 13 teams in the CMCL 2021 shared task.

Eye movement data during reading is a useful source of information for understanding language comprehension processes. In this paper, we describe our submission to the CMCL 2021 shared task on predicting human reading patterns. Our model uses RoBERTa with a regression layer to predict 5 eye-tracking features. We train the model in two stages: we first fine-tune on the Provo corpus (another eye-tracking dataset), then fine-tune on the task data. We compare different Transformer models and apply ensembling methods to improve performance. Our final submission achieves an MAE score of 3.929, ranking 3rd out of the 13 teams that participated in this shared task.
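The evaluation and ensembling steps mentioned in the abstract can be illustrated with a minimal sketch. It assumes each model emits one row of 5 feature values per token, that MAE is averaged over all tokens and features, and that ensembling is a plain mean over model outputs. The abstract does not specify the ensembling method or the exact averaging scheme, so both are assumptions. The function names `mae` and `ensemble_mean` and the toy numbers are hypothetical.

```python
from statistics import mean


def mae(predictions, targets):
    """Mean absolute error over every token and every feature.

    Both arguments are lists of rows, one row per token,
    each row holding the 5 eye-tracking feature values.
    """
    errors = [
        abs(p - t)
        for pred_row, target_row in zip(predictions, targets)
        for p, t in zip(pred_row, target_row)
    ]
    return mean(errors)


def ensemble_mean(model_outputs):
    """Average predictions across models (assumed ensembling scheme).

    model_outputs[m][i][f] is model m's prediction for token i, feature f.
    """
    n_models = len(model_outputs)
    return [
        [sum(values) / n_models for values in zip(*rows_for_token)]
        for rows_for_token in zip(*model_outputs)
    ]


# Toy data: 2 tokens, 5 features each (values are made up for illustration).
targets = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.0, 2.0, 2.0, 2.0]]
model_a = [[2.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.0, 2.0, 2.0, 4.0]]
model_b = [[0.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.0, 2.0, 2.0, 2.0]]

ensemble = ensemble_mean([model_a, model_b])
print(mae(model_a, targets), mae(model_b, targets), mae(ensemble, targets))
```

In this toy case the two models err in opposite directions on the first feature, so averaging cancels that error. Real gains from ensembling depend on how correlated the models' errors are.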

Code Implementations: 1 repo
