Improving Portuguese Semantic Role Labeling with Transformers and Transfer Learning
This work is relevant to researchers and developers working on natural language processing in Portuguese, particularly on tasks that depend on semantic understanding.
This paper addresses Semantic Role Labeling (SRL) in Portuguese, a low-resource language, using a Transformer-based architecture. The authors report an improvement of over 15 F1 points over the previous state of the art in Portuguese SRL.
The natural language processing task of determining "who did what to whom" is called semantic role labeling. For English, recent methods based on Transformer models have allowed for major improvements in this task over the previous state of the art. However, for low-resource languages such as Portuguese, currently available semantic role labeling models are hindered by scarce training data. In this paper, we explore a model architecture consisting only of a pre-trained Transformer-based model, a linear layer, softmax, and Viterbi decoding. We improve the state-of-the-art performance in Portuguese by over 15 F1 points. Additionally, we improve semantic role labeling results on Portuguese corpora by exploiting cross-lingual transfer learning using multilingual pre-trained models, and transfer learning from dependency parsing in Portuguese. We evaluate the proposed approaches empirically.
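The decoding end of the architecture described above (per-token scores, softmax, Viterbi decoding) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes BIO-style role tags and a hard constraint that an I- tag may only follow a B- or I- tag of the same role. The tag set, the example sentence, and the hand-written logits (standing in for the output of the Transformer and linear layer) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one token's scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def allowed(prev_tag, curr_tag):
    """Assumed BIO constraint: I-X may only follow B-X or I-X."""
    if curr_tag.startswith("I-"):
        role = curr_tag[2:]
        return prev_tag in ("B-" + role, "I-" + role)
    return True


def greedy_decode(logits, tags):
    """Pick the highest-scoring tag per token, ignoring constraints."""
    return [tags[max(range(len(tags)), key=lambda j: row[j])] for row in logits]


def viterbi_decode(logits, tags):
    """Return the most probable tag sequence that satisfies `allowed`."""
    if not logits:
        return []
    log_probs = [[math.log(p) for p in softmax(row)] for row in logits]
    n = len(tags)
    # A sequence may not start with an I- tag.
    best = [
        float("-inf") if tags[j].startswith("I-") else log_probs[0][j]
        for j in range(n)
    ]
    backpointers = []
    for t in range(1, len(log_probs)):
        new_best, pointers = [], []
        for j in range(n):
            # Best previous tag among those allowed to precede tag j.
            score, i = max(
                (best[i], i) for i in range(n) if allowed(tags[i], tags[j])
            )
            new_best.append(score + log_probs[t][j])
            pointers.append(i)
        best = new_best
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    j = max(range(n), key=lambda k: best[k])
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [tags[j] for j in path]


# Hypothetical tag set and logits for the three tokens of "A Maria comprou".
TAGS = ["O", "B-A0", "I-A0", "B-V"]
LOGITS = [
    [2.0, 1.8, 0.0, 0.0],  # "A": O narrowly beats B-A0
    [0.0, 0.5, 3.0, 0.0],  # "Maria": strongly I-A0
    [0.0, 0.0, 0.0, 3.0],  # "comprou": strongly B-V
]

print(greedy_decode(LOGITS, TAGS))
print(viterbi_decode(LOGITS, TAGS))
```

In this toy input, per-token argmax places I-A0 directly after O, which is not a valid BIO sequence. The constrained Viterbi search instead opens the argument span with B-A0 on the first token, trading a slightly less likely first tag for a sequence that is valid as a whole.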