CL | LG | Nov 10, 2020

UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations

arXiv:2011.05197v1 | 1 citation
AI Analysis

This work addresses data scarcity in language modeling for a specific shared task, but it is incremental as it builds on existing self-supervised and multi-task learning methods.

The paper tackles the problem of limited labeled data by proposing a self-supervised data augmentation approach based on multi-task learning. The approach improved prediction quality in the AcCompl-it shared task at EVALITA 2020, though the abstract reports no concrete numbers.

This work describes a self-supervised data augmentation approach used to improve the performance of learning models when only a moderate amount of labeled data is available. Multiple copies of the original model are initially trained on the downstream task. Their predictions are then used to annotate a large set of unlabeled examples. Finally, multi-task training is performed on the parallel annotations of the resulting training set, and final scores are obtained by averaging annotator-specific head predictions. Neural language models are fine-tuned using this procedure in the context of the AcCompl-it shared task at EVALITA 2020, obtaining considerable improvements in prediction quality.
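
The procedure described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: a closed-form ridge regressor stands in for the fine-tuned language model, bootstrap resampling is an assumed way of making the model copies differ, and a tiny shared-layer network with one output head per annotator stands in for the multi-task model. All function names (`fit_ridge`, `pseudo_annotate`, `train_multitask`, `predict`), the synthetic data, and the hyperparameters are hypothetical. Whether gold labels are mixed into the final training set is not stated here, so the sketch trains on the pseudo-annotated pool only.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge regression; a stand-in for a fine-tuned model copy."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def pseudo_annotate(X_lab, y_lab, X_unlab, n_annotators=3, seed=0):
    """Train several model copies and let each one label the unlabeled pool.

    Assumption: copies differ because each sees a bootstrap resample.
    Returns an array of shape (n_unlabeled, n_annotators).
    """
    rng = np.random.default_rng(seed)
    columns = []
    for _ in range(n_annotators):
        idx = rng.integers(0, len(X_lab), size=len(X_lab))
        w = fit_ridge(X_lab[idx], y_lab[idx])
        columns.append(X_unlab @ w)
    return np.stack(columns, axis=1)

def train_multitask(X, Y, hidden=8, lr=0.1, epochs=500, seed=0):
    """Shared tanh layer plus one linear head per annotator column of Y.

    Trained with full-batch gradient descent on squared error summed
    over heads and averaged over examples.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(X.shape[1], hidden))  # shared encoder
    W2 = rng.normal(scale=0.1, size=(hidden, Y.shape[1]))  # annotator heads
    n = len(X)
    for _ in range(epochs):
        H = np.tanh(X @ W1)
        err = H @ W2 - Y                      # (n, n_annotators)
        grad_W2 = H.T @ err / n
        grad_pre = (err @ W2.T) * (1.0 - H ** 2)  # backprop through tanh
        grad_W1 = X.T @ grad_pre / n
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    return W1, W2

def predict(X, W1, W2):
    """Final score: average of the annotator-specific head predictions."""
    return (np.tanh(X @ W1) @ W2).mean(axis=1)

# Synthetic data: a small labeled set and a larger unlabeled pool.
rng = np.random.default_rng(42)
w_true = np.array([0.5, -0.3, 0.2, 0.0])
X_lab = rng.normal(size=(30, 4))
y_lab = X_lab @ w_true + 0.1 * rng.normal(size=30)
X_unlab = rng.normal(size=(500, 4))
X_test = rng.normal(size=(20, 4))

Y_pseudo = pseudo_annotate(X_lab, y_lab, X_unlab, n_annotators=3)
W1, W2 = train_multitask(X_unlab, Y_pseudo)
scores = predict(X_test, W1, W2)
print(scores.shape)
```

The point of the sketch is the data flow: each annotator copy contributes one column of pseudo-labels, the multi-task model fits all columns through a shared layer, and the final score is the mean over heads. In the paper's setting the shared layer would presumably be the language model encoder, which the toy network only approximates.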

Code Implementations: 1 repo