IR, CL | Nov 3, 2023

Plot Retrieval as an Assessment of Abstract Semantic Association

Shicheng Xu, Liang Pang, Jiangnan Li, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou

arXiv:2311.01666v1 | 15.914 citations | h-index: 45

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of improving readers' efficiency by enabling better plot retrieval from books. The contribution is incremental, as it focuses on dataset creation and benchmarking rather than on a new method.

The authors introduced Plot Retrieval, a dataset designed to evaluate information retrieval models on their ability to capture abstract semantic associations between queries and book plots, rather than just lexical or semantic matching. Experiments, compared against human studies, showed that current IR models still struggle with this task.

Retrieving the plots in a book that are relevant to a query is a critical task that can improve readers' experience and efficiency. Readers usually give only an abstract, vague description as the query, based on their own understanding, summaries, or speculations about the plot, so the retrieval model must be able to estimate the abstract semantic associations between the query and candidate plots. However, existing information retrieval (IR) datasets do not reflect this ability well. In this paper, we propose Plot Retrieval, a labeled dataset for training and evaluating IR models on the novel task of plot retrieval. Text pairs in Plot Retrieval have less word overlap and more abstract semantic association, so the dataset reflects a model's ability to estimate abstract semantic association rather than just traditional lexical or semantic matching. Extensive experiments across lexical retrieval, sparse retrieval, dense retrieval, and cross-encoder methods, compared with human studies on Plot Retrieval, show that current IR models still struggle to capture abstract semantic association between texts. Plot Retrieval can serve as a benchmark for further research on the semantic association modeling ability of IR models.
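To make the "less word overlap" point concrete, the following is a minimal sketch of why purely lexical matching can rank the wrong passage first. It is not the paper's method or evaluation code: the Jaccard token overlap is a stand-in for lexical matching in general, the function names are hypothetical, and the three example texts are invented for illustration rather than taken from the dataset.

```python
import re


def tokens(text: str) -> set:
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard_overlap(query: str, passage: str) -> float:
    """Jaccard similarity between the token sets of a query and a passage."""
    q, p = tokens(query), tokens(passage)
    if not q or not p:
        return 0.0
    return len(q & p) / len(q | p)


# Hypothetical texts, invented for illustration (not from Plot Retrieval).
query = "the hero finally forgives his rival"
relevant_plot = "He lowered the sword and offered his enemy a hand."
distractor = "The hero and his rival argue at the market."

relevant_score = jaccard_overlap(query, relevant_plot)
distractor_score = jaccard_overlap(query, distractor)

# The passage that actually depicts forgiveness shares fewer words with
# the query than the distractor does, so word overlap ranks it lower.
print(f"relevant:   {relevant_score:.3f}")
print(f"distractor: {distractor_score:.3f}")
```

In this toy case the relevant passage shares only function words with the query, while the distractor repeats its content words, which is the kind of mismatch the abstract describes between lexical matching and abstract semantic association.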
