CL IR Nov 30, 2021

Towards Full-Fledged Argument Search: A Framework for Extracting and Clustering Arguments from Unstructured Text

arXiv:2112.00160v10.51 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more comprehensive argument search tools in natural language processing. It is an incremental advance: it builds on existing methods and targets specific shortcomings of earlier frameworks.

The paper tackles argument search over unstructured text by proposing a framework that covers argument-query matching, multi-sentence argument identification, and topic-aware argument clustering. It reports a macro F1 score of 0.71 for argument identification.
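
For readers unfamiliar with the metric, the macro F1 score is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal sketch of that computation, not the paper's evaluation code; the function name `macro_f1` and the B/I/O sentence labels in the usage example are illustrative assumptions.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical sentence-level labels (assumed B/I/O scheme, for illustration only).
gold = ["B", "I", "O", "O", "B", "I"]
pred = ["B", "I", "O", "B", "B", "O"]
print(round(macro_f1(gold, pred), 4))
```

Because every class contributes equally, a model that ignores a small class is penalized more under macro F1 than under accuracy.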

Argument search aims at identifying arguments in natural language texts. In the past, this task has been addressed by a combination of keyword search and argument identification at the sentence or document level. However, existing frameworks often address only specific components of argument search and do not cover the following aspects: (1) argument-query matching: identifying arguments that frame the topic slightly differently than the actual search query; (2) argument identification: identifying arguments that consist of multiple sentences; (3) argument clustering: selecting retrieved arguments by topical aspects. In this paper, we propose a framework for addressing these shortcomings. We suggest (1) combining the keyword search with precomputed topic clusters for argument-query matching, (2) applying a novel approach based on sentence-level sequence labeling for argument identification, and (3) presenting aggregated arguments to users based on topic-aware argument clustering. Our experiments on several real-world debate data sets demonstrate that density-based clustering algorithms, such as HDBSCAN, are particularly suitable for argument-query matching. With our sentence-level, BiLSTM-based sequence-labeling approach, we achieve a macro F1 score of 0.71. Finally, the evaluation of our argument clustering method indicates that a fine-grained clustering of arguments by subtopics remains challenging but is worth exploring.
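
One possible reading of step (1), combining keyword search with precomputed topic clusters, can be sketched as follows. This is a hedged illustration and not the authors' implementation: the function `match_query`, the whitespace tokenization, and the toy documents are assumptions. The cluster ids stand in for the output of a density-based algorithm such as HDBSCAN, where the label -1 conventionally marks noise points.

```python
def match_query(query, docs, cluster_ids):
    """Return (direct keyword hits, extra documents from the hits' clusters)."""
    terms = set(query.lower().split())
    # Direct hits: documents sharing at least one token with the query.
    hits = [i for i, doc in enumerate(docs) if terms & set(doc.lower().split())]
    hit_set = set(hits)
    # Clusters touched by the hits; -1 is treated as noise and never expanded.
    hit_clusters = {cluster_ids[i] for i in hits if cluster_ids[i] != -1}
    # Expansion: same-cluster documents that the keyword search missed.
    expanded = [
        i for i, c in enumerate(cluster_ids)
        if c in hit_clusters and i not in hit_set
    ]
    return hits, expanded


# Toy corpus with hypothetical, precomputed cluster ids.
docs = [
    "Nuclear power is reliable",
    "Atomic energy produces little carbon",
    "School uniforms limit expression",
    "Random remark",
]
cluster_ids = [0, 0, 1, -1]

hits, expanded = match_query("nuclear power", docs, cluster_ids)
print(hits, expanded)
```

In this toy example the second document shares no keyword with the query but is still retrieved through its cluster, which is the kind of differently framed argument that argument-query matching is meant to recover.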
