Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study
For practitioners working with LLMs, this study examines how useful knowledge graph augmentation and self-consistency decoding are for zero-shot multi-label topic classification. Its results suggest that larger models may already possess sufficient relational information.
This paper tackles zero-shot multi-label topic classification, especially for documents with complex relational information, by investigating the impact of per-article knowledge graph augmentation. Keyword-enhanced classification (AK) was the best-performing base method, with 6 of the 15 LLMs outperforming a sentence-encoder baseline. Knowledge graph augmentation helped smaller models but hurt larger ones, and self-consistency decoding did not improve performance.
Multi-label topic classification without labeled training data is a challenging task, especially when documents contain complex relational information. We present a zero-shot multi-label topic classification framework and systematically investigate how per-article knowledge graph augmentation affects its performance. The base framework classifies topics in documents without labeled training data and has four variants: article-only classification, keyword-enhanced classification, and self-consistency decoding variants of both. We then augment each base variant with a per-article knowledge graph, extracted from the input document as subject-predicate-object triples through a pipeline similar to KGGen. We test all eight methods (four base and four graph-augmented) on fifteen LLMs and eight multi-label datasets from different domains. Within the base framework, keyword-enhanced classification (AK) is the best-performing method, and six of the fifteen LLMs surpass the sentence-encoder baseline. Graph augmentation improves the performance of small models but degrades that of large models, which suggests that larger models already contain enough relational information from pretraining. Furthermore, the self-consistency decoding variants show no performance improvement in any experiment while increasing computation cost about fivefold.
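As an illustration of how the variants described above relate to one another, the following is a minimal Python sketch, not the authors' implementation. The function names, the prompt wording, and the majority-vote rule used to aggregate self-consistency samples are all assumptions made for this example; the abstract does not specify them. The LLM call itself is left out, so the sketch covers only prompt assembly and label aggregation.

```python
from collections import Counter
from typing import Iterable, Optional, Sequence

# A knowledge graph triple: (subject, predicate, object).
Triple = tuple[str, str, str]


def build_prompt(
    article: str,
    labels: Sequence[str],
    keywords: Optional[Sequence[str]] = None,
    triples: Optional[Sequence[Triple]] = None,
) -> str:
    """Assemble a zero-shot prompt (hypothetical wording).

    keywords=None, triples=None -> article-only variant
    keywords given              -> keyword-enhanced variant
    triples given               -> graph-augmented version of either
    """
    parts = ["Candidate topics: " + ", ".join(labels)]
    if keywords:
        parts.append("Keywords: " + ", ".join(keywords))
    if triples:
        lines = [f"({s}; {p}; {o})" for s, p, o in triples]
        parts.append("Knowledge graph triples:\n" + "\n".join(lines))
    parts.append("Article:\n" + article)
    parts.append("List every topic that applies.")
    return "\n\n".join(parts)


def majority_vote(
    samples: Sequence[Iterable[str]], min_share: float = 0.5
) -> list[str]:
    """Aggregate several sampled label sets (assumed aggregation rule).

    A label is kept if it appears in more than `min_share` of the samples.
    """
    if not samples:
        return []
    # Count each label at most once per sample.
    counts = Counter(label for sample in samples for label in set(sample))
    return sorted(
        label for label, n in counts.items() if n / len(samples) > min_share
    )
```

For example, `majority_vote([{"a", "b"}, {"a"}, {"a", "c"}, {"b"}, {"a", "b"}])` keeps `a` and `b` and drops `c`. Because every sample in a self-consistency variant is a separate LLM call on the same document, its cost grows linearly with the number of samples.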