An Empirical Study on Measuring the Similarity of Sentential Arguments with Language Model Domain Adaptation
This work addresses the high cost of labeled data in argument mining, a concern for both researchers and practitioners, although the contribution is incremental because it builds on existing transfer learning methods.
The paper tackles the problem of measuring the similarity between sentential arguments by combining transfer learning with self-supervised domain adaptation. It reports improved correlation with human-annotated scores in an unsupervised setting, and performance comparable to a fully supervised baseline using only about 60% of the labeled data.
Measuring the similarity between two sentential arguments is an important task in argument mining. However, one of the challenges in this field is that annotating a dataset requires expertise in a variety of topics, which makes supervised learning with labeled data expensive. In this paper, we investigate whether this problem can be alleviated through transfer learning. We first adapt a pretrained language model to a domain of interest using self-supervised learning. Then, we fine-tune the model on the task of measuring the similarity between sentences taken from different domains. Our approach improves the correlation with human-annotated similarity scores over competitive baseline models on the Argument Facet Similarity dataset in an unsupervised setting. Moreover, we achieve performance comparable to a fully supervised baseline model while using only about 60% of the labeled data samples. We believe that our work suggests the possibility of a generalized argument clustering model for various argumentative topics.
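As a rough illustration of the evaluation described above, the sketch below scores sentence pairs by the cosine similarity of their embeddings and then measures how well those scores correlate with human-annotated similarity. Everything in it is an assumption made for illustration: the abstract does not state which similarity function or correlation coefficient is used (Pearson is chosen here), the two-dimensional vectors stand in for sentence embeddings that a domain-adapted language model would produce, and the human scores are invented toy values.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical stand-ins for sentence embeddings of three argument pairs.
# In practice these would come from the adapted language model.
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),  # nearly identical arguments
    ([1.0, 0.0], [0.5, 0.5]),  # partially related arguments
    ([1.0, 0.0], [0.0, 1.0]),  # unrelated arguments
]

# Hypothetical human-annotated similarity scores for the same pairs.
human_scores = [5.0, 3.0, 0.0]

# Model scores are computed without any labeled similarity data,
# mirroring the unsupervised setting mentioned in the abstract.
model_scores = [cosine_similarity(u, v) for u, v in pairs]

# A higher correlation means the model ranks pairs more like humans do.
r = pearson(model_scores, human_scores)
print(f"Pearson correlation with human scores: {r:.3f}")
```

A rank-based coefficient such as Spearman could be substituted for `pearson` if only the ordering of pairs matters; the choice here is purely illustrative.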