Automatic Argumentative-Zoning Using Word2vec
This work addresses the problem of automating argumentative zoning for scientific paper analysis, offering an incremental improvement over traditional feature engineering methods.
The paper tackles argumentative zoning in scientific papers by exploring sentence vector models built on word2vec embeddings. It finds that averaging word vectors outperforms the paragraph-to-vector algorithm, that integrating cuewords improves classification, and that word2vec beats hand-crafted features in most categories.
Like document summarization for social media and newswire articles, argumentative zoning (AZ) is an important task in scientific paper analysis. Traditional approaches to this task rely on feature engineering at different levels. In this paper, three models for generating sentence vectors for sentence classification are explored and compared. The proposed approach builds sentence representations from word embeddings learned by a neural network. The learned word embeddings form a feature space onto which each examined sentence is mapped. The resulting features are fed to classifiers for supervised classification. The evaluation was conducted on the Argumentative-Zoning (AZ) annotated articles using a 10-fold cross-validation scheme. The results show that simply averaging the word vectors in a sentence works better than the paragraph-to-vector algorithm, and that integrating specific cuewords into the loss function of the neural network can improve classification performance. Compared with hand-crafted features, the word2vec method performed better on most categories, although hand-crafted features remained stronger on some of them.
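The averaging model mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the embedding table holds made-up 3-dimensional values rather than trained word2vec vectors, the function name `sentence_vector` is hypothetical, and skipping out-of-vocabulary tokens is an assumption made for this example.

```python
from typing import Dict, List

# Toy 3-dimensional embeddings. These values are invented for illustration;
# a real system would load vectors trained with word2vec.
EMBEDDINGS: Dict[str, List[float]] = {
    "we": [0.1, 0.3, -0.2],
    "propose": [0.7, -0.1, 0.4],
    "a": [0.0, 0.1, 0.0],
    "method": [0.4, 0.2, 0.6],
}


def sentence_vector(
    tokens: List[str], embeddings: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the vectors of the in-vocabulary tokens of one sentence.

    Out-of-vocabulary tokens are skipped (an assumption of this sketch).
    A sentence with no known token maps to the zero vector.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# "novel" is not in the toy vocabulary, so four vectors are averaged.
features = sentence_vector("we propose a novel method".split(), EMBEDDINGS, dim=3)
print(features)
```

Each sentence becomes a fixed-length feature vector regardless of its length, which is what allows it to be passed to a standard supervised classifier.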