CL · Sep 17, 2018

Categorizing Comparative Sentences

Alexander Panchenko, Alexander Bondarenko, Mirco Franzek, Matthias Hagen, Chris Biemann

arXiv:1809.06152v232.01094 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the need to extract comparative sentences in support of pro/con argumentation in search engines and debating technologies. The contribution is incremental: it applies existing methods to a newly annotated dataset.

The paper tackles the problem of automatically identifying and categorizing comparative sentences, for example recovering the stated preference in 'Python has better NLP libraries than MATLAB'. The authors manually annotate 7,199 sentences, and a gradient boosting model based on pre-trained sentence embeddings reaches an F1 score of 85%.

We tackle the tasks of automatically identifying comparative sentences and categorizing the intended preference (e.g., "Python has better NLP libraries than MATLAB" => (Python, better, MATLAB)). To this end, we manually annotate 7,199 sentences for 217 distinct target item pairs from several domains (27% of the sentences contain an oriented comparison in the sense of "better" or "worse"). A gradient boosting model based on pre-trained sentence embeddings reaches an F1 score of 85% in our experimental evaluation. The model can be used to extract comparative sentences for pro/con argumentation in comparative / argument search engines or debating technologies.
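
To make the output format and the evaluation metric concrete, here is a minimal sketch of how a preference triple might be represented and how a per-label F1 score is computed from gold and predicted labels. The `Preference` container, the label names, and the toy data are assumptions made for illustration; the abstract does not specify the label set or the F1 averaging scheme, and the authors' gradient boosting model is not reproduced here.

```python
from collections import namedtuple

# Hypothetical container for an extracted comparison; field names are illustrative.
Preference = namedtuple("Preference", ["first", "relation", "second"])


def f1_for_label(gold, predicted, label):
    """Return the F1 score (harmonic mean of precision and recall) for one label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# The example sentence from the abstract, expressed as a triple.
example = Preference("Python", "better", "MATLAB")

# Toy labels, made up for illustration (not taken from the annotated dataset).
gold = ["better", "none", "worse", "none", "better", "none"]
predicted = ["better", "none", "none", "none", "better", "worse"]

scores = {
    label: f1_for_label(gold, predicted, label)
    for label in ("better", "worse", "none")
}
print(example)
print(scores)
```

A single summary figure such as the reported 85% would come from aggregating scores like these over all labels; which aggregation the authors use is not stated in the abstract.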
