CL, May 28

Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study

arXiv:2605.30465, 24.2, h-index: 3
AI Analysis

For practitioners doing zero-shot multi-label topic classification with LLMs, this study indicates that knowledge graph augmentation helps smaller models but hurts larger ones, which may already possess sufficient relational information, and that self-consistency decoding adds computation cost without improving performance.

This paper tackles zero-shot multi-label topic classification, especially for documents with complex relational information, by investigating the impact of per-article knowledge graph augmentation. The study found that keyword-enhanced classification (AK) was the best-performing base method, with 6 out of 15 LLMs outperforming a sentence-encoder baseline. Knowledge graph augmentation improved smaller models but degraded larger ones, and self-consistency decoding did not improve performance.

Multi-label topic classification without labeled training data is a challenging task, especially when documents contain complex relational information. We present a zero-shot multi-label topic classification framework and systematically investigate how per-article knowledge graph augmentation affects its performance. The base framework classifies topics in documents without labeled training data and has four variants: article-only classification, keyword-enhanced classification, and self-consistency decoding variants of both. We then augment each base variant with a per-article knowledge graph. This graph is extracted from the input document through a pipeline similar to KGGen, based on subject-predicate-object triples. We test all eight methods (four base and four graph-augmented) on fifteen LLMs and eight multi-label datasets across different domains. For the base framework, keyword-enhanced classification (AK) is the best-performing method, and six out of fifteen LLMs surpass the sentence-encoder baseline. Graph augmentation improves performance for small models and degrades it for large models, which suggests that larger models already contain enough relational information from pretraining. Furthermore, the self-consistency decoding variant does not improve performance in any experiment while increasing computation costs about fivefold.
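The variants described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the function names (`build_prompt`, `parse_labels`, `self_consistent_labels`), the majority-vote aggregation rule, and the choice of five samples are all assumptions made for illustration (five samples is merely consistent with the roughly fivefold cost reported). The `sample` argument stands in for any function that sends a prompt to an LLM and returns its text response.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# A knowledge graph triple: (subject, predicate, object).
Triple = tuple[str, str, str]


def build_prompt(
    article: str,
    labels: Sequence[str],
    keywords: Optional[Sequence[str]] = None,
    triples: Optional[Sequence[Triple]] = None,
) -> str:
    """Assemble a zero-shot prompt (hypothetical wording).

    Article only: pass neither keywords nor triples.
    Keyword-enhanced: pass keywords.
    Graph-augmented: additionally pass the per-article triples.
    """
    parts = [
        "Assign every applicable topic to the article.",
        "Candidate topics: " + ", ".join(labels),
        "Article:\n" + article,
    ]
    if keywords:
        parts.append("Keywords: " + ", ".join(keywords))
    if triples:
        lines = [f"({s}; {p}; {o})" for s, p, o in triples]
        parts.append("Knowledge graph triples:\n" + "\n".join(lines))
    parts.append("Answer with a comma-separated list of topics.")
    return "\n\n".join(parts)


def parse_labels(response: str, labels: Sequence[str]) -> set[str]:
    """Keep only response items that match a candidate label (case-insensitive)."""
    allowed = {label.lower(): label for label in labels}
    found = set()
    for piece in response.split(","):
        key = piece.strip().lower()
        if key in allowed:
            found.add(allowed[key])
    return found


def self_consistent_labels(
    sample: Callable[[str], str],
    prompt: str,
    labels: Sequence[str],
    n_samples: int = 5,  # assumed value; each sample is one extra LLM call
) -> set[str]:
    """Sample the model n_samples times and keep labels chosen by a strict majority."""
    votes: Counter = Counter()
    for _ in range(n_samples):
        votes.update(parse_labels(sample(prompt), labels))
    return {label for label, count in votes.items() if count * 2 > n_samples}
```

Under these assumptions, the single-call variants use one `sample(prompt)` call followed by `parse_labels`, while the self-consistency variants call the model `n_samples` times, which is where the additional cost comes from.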
