Rhetorical relations for information retrieval
This work examines whether discourse structure, in the form of rhetorical relations, can be used to improve search results in information retrieval. It is an incremental advance in the field.
The paper incorporates rhetorical relations into a language model for information retrieval and finds that certain relations improve retrieval effectiveness by more than 10% in mean average precision over a state-of-the-art baseline.
In most coherent text, every part typically has some plausible reason for its presence, some function that it performs in the overall semantics of the text. Rhetorical relations, e.g., contrast, cause, and explanation, describe how the parts of a text are linked to each other. Knowledge about this so-called discourse structure has been applied successfully to several natural language processing tasks. This work studies the use of rhetorical relations for Information Retrieval (IR): Is there a correlation between certain rhetorical relations and retrieval performance? Can knowledge about a document's rhetorical relations be useful to IR? We present a language model modification that considers rhetorical relations when estimating the relevance of a document to a query. Empirical evaluation of different versions of our model on TREC settings shows that certain rhetorical relations can benefit retrieval effectiveness notably (> 10% in mean average precision over a state-of-the-art baseline).
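To make the idea of a relation-aware language model concrete, the following is a minimal sketch of one way such a modification might look. It is not the model from this work: the linear interpolation, the weight `lam`, the Dirichlet parameter `mu`, the function names, and the toy document are all assumptions chosen for illustration. The sketch scores a document by query likelihood, mixing a model of the whole document with a model built only from the text spans labelled with a given rhetorical relation.

```python
import math
from collections import Counter


def smoothed_prob(term, counts, length, collection_prob, mu):
    """Dirichlet-smoothed unigram probability of `term`."""
    return (counts.get(term, 0) + mu * collection_prob) / (length + mu)


def score(query, doc_spans, collection_probs, relation, lam=0.3, mu=2000.0):
    """Log query likelihood under a hypothetical relation-aware mixture.

    doc_spans: list of (relation_label, tokens) pairs for one document.
    relation:  the rhetorical relation whose spans get extra weight.
    lam:       assumed mixing weight; lam=0 gives plain query likelihood.
    """
    # Term counts over the whole document.
    doc_counts = Counter(t for _, toks in doc_spans for t in toks)
    doc_len = sum(doc_counts.values())
    # Term counts restricted to spans carrying the chosen relation.
    rel_counts = Counter(
        t for label, toks in doc_spans if label == relation for t in toks
    )
    rel_len = sum(rel_counts.values())

    total = 0.0
    for term in query:
        # Assumed small floor for terms missing from the collection table.
        p_c = collection_probs.get(term, 1e-6)
        p_doc = smoothed_prob(term, doc_counts, doc_len, p_c, mu)
        p_rel = smoothed_prob(term, rel_counts, rel_len, p_c, mu)
        total += math.log((1.0 - lam) * p_doc + lam * p_rel)
    return total


# Toy document: two spans, each tagged with an illustrative relation label.
doc = [
    ("contrast", ["search", "fails", "often"]),
    ("cause", ["noisy", "index", "data"]),
]
collection_probs = {"search": 0.01}
query = ["search"]

base = score(query, doc, collection_probs, "contrast", lam=0.0, mu=10.0)
boosted = score(query, doc, collection_probs, "contrast", lam=0.3, mu=10.0)
other = score(query, doc, collection_probs, "cause", lam=0.3, mu=10.0)
```

In this toy setting the query term occurs inside the span labelled contrast, so weighting that relation raises the score relative to plain query likelihood, while weighting cause lowers it. Whether a given relation helps or hurts on real data is the empirical question the evaluation on TREC settings addresses.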