CLIR
Oct 17, 2022

Zero-Shot Ranking Socio-Political Texts with Transformer Language Models to Reduce Close Reading Time

arXiv:2210.09179v1290 citations
h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently extracting information from socio-political documents for researchers or analysts, though it appears incremental as it builds on existing zero-shot and ranking methods.

The paper frames the classification of socio-political texts as an entailment task and applies zero-shot ranking with Transformer language models, so that analysts only need to close-read the top-ranked documents. DeBERTa achieves higher mean average precision than RoBERTa, and declarative queries outperform dictionary-definition queries.

We approach the classification problem as an entailment problem and apply zero-shot ranking to socio-political texts. Documents ranked at the top can be considered positively classified, which reduces the close reading time of the information extraction process. We use Transformer language models to obtain the entailment probabilities and investigate different types of queries. We find that DeBERTa achieves higher mean average precision scores than RoBERTa, and that using the declarative form of the class label as a query outperforms using the dictionary definition of the class label. We show that one can reduce the close reading time by reading only a percentage of the ranked documents, where the percentage depends on how much recall one wants to achieve. However, our findings also show that the percentage of documents that must be read increases as the topic gets broader.
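The ranking-and-cutoff idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `entail_prob` is a hypothetical stand-in for whatever entailment probability a natural language inference model returns for a (document, query) pair, and the function names and toy labels are assumptions made for illustration only.

```python
from typing import Callable, Sequence


def rank_by_entailment(
    docs: Sequence[str],
    hypothesis: str,
    entail_prob: Callable[[str, str], float],
) -> list:
    """Sort documents by descending entailment probability for the query.

    `entail_prob` is a hypothetical scorer; in practice it would wrap an
    NLI model that returns P(entailment | document, hypothesis).
    """
    return sorted(docs, key=lambda d: entail_prob(d, hypothesis), reverse=True)


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision of one ranked list of binary relevance labels."""
    hits, precision_sum = 0, 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def fraction_to_read(ranked_labels: Sequence[int], target_recall: float) -> float:
    """Smallest fraction of the ranked list to read to reach target_recall."""
    total = sum(ranked_labels)
    if total == 0 or target_recall <= 0:
        return 0.0
    found = 0
    for rank, relevant in enumerate(ranked_labels, start=1):
        found += relevant
        if found / total >= target_recall:
            return rank / len(ranked_labels)
    return 1.0


# Toy relevance labels in ranked order (1 = relevant), invented for illustration.
labels = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
ap = average_precision(labels)
read_75 = fraction_to_read(labels, 0.75)
read_100 = fraction_to_read(labels, 1.0)
```

Mean average precision is the mean of `average_precision` over all queries. In the toy list, reaching full recall requires reading a larger share of the ranking than reaching 75% recall, which is the trade-off the abstract describes.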

