CL HC IR · Oct 20, 2020

Enhancing Keyphrase Extraction from Microblogs using Human Reading Time

arXiv:2010.09934v2 · 8 citations · Has Code

AI Analysis

This work addresses keyphrase extraction from microblogs for applications such as information retrieval. The contribution is incremental: it builds on existing methods by adding a new feature.

The authors tackled keyphrase extraction from microblog posts by incorporating human reading time, measured via eye fixation durations, into neural network models, resulting in improved performance over baselines on two datasets.

The premise of manual keyphrase annotation is to read the corresponding content of an annotated object. Intuitively, when we read, more important words will occupy a longer reading time. Hence, by leveraging human reading time, we can find the salient words in the corresponding content. However, previous studies on keyphrase extraction ignore human reading features. In this article, we aim to leverage human reading time to extract keyphrases from microblog posts. There are two main tasks in this study. One is to determine how to measure the time spent by a human on reading a word. We use eye fixation durations extracted from an open source eye-tracking corpus (OSEC). Moreover, we propose strategies to make eye fixation duration more effective on keyphrase extraction. The other task is to determine how to integrate human reading time into keyphrase extraction models. We propose two novel neural network models. The first is a model in which the human reading time is used as the ground truth of the attention mechanism. In the second model, we use human reading time as the external feature. Quantitative and qualitative experiments show that our proposed models yield better performance than the baseline models on two microblog datasets.
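The two integration strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the loss or the normalization, so the sum-to-one normalization of fixation durations, the cross-entropy loss, and all function names below are assumptions made for illustration only.

```python
import math

def normalize_durations(durations):
    """Turn per-word fixation durations (ms) into a distribution that sums to 1.
    Assumption: simple sum normalization; the paper may use another scheme."""
    total = sum(durations)
    if total == 0:
        return [1.0 / len(durations)] * len(durations)
    return [d / total for d in durations]

def softmax(scores):
    """Numerically stable softmax over raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attention_supervision_loss(scores, durations, eps=1e-12):
    """Strategy 1 (sketch): treat normalized reading time as the target for
    the attention weights. Cross-entropy is an assumed choice of loss."""
    target = normalize_durations(durations)
    attn = softmax(scores)
    return -sum(t * math.log(a + eps) for t, a in zip(target, attn))

def append_reading_feature(embeddings, durations):
    """Strategy 2 (sketch): append normalized reading time to each word
    vector as one extra input dimension."""
    target = normalize_durations(durations)
    return [vec + [t] for vec, t in zip(embeddings, target)]
```

In this sketch, attention scores that favor the words read for longer give a lower loss than scores that favor the other words, which is the behavior the first strategy is meant to encourage.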
