CL | Jan 16, 2025

A Simple Graph Contrastive Learning Framework for Short Text Classification

arXiv:2501.09219v1 | 5 citations | h-index: 18 | AAAI
Originality: Incremental advance
AI Analysis

This work addresses semantic sparsity and limited labeled data in short text classification, offering a novel approach that improves accuracy for applications like social media analysis, though it is incremental in the context of existing graph and contrastive learning methods.

The paper tackles short text classification by proposing a simple graph contrastive learning framework that eliminates explicit data augmentation, which avoids the semantic corruption and noise that augmentation introduces, while still leveraging multi-view embeddings. The authors report outstanding performance, surpassing large language models on various datasets.

Short text classification has gained significant attention in the information age due to its prevalence and real-world applications. Recent advancements in graph learning combined with contrastive learning have shown promising results in addressing the challenges of semantic sparsity and limited labeled data in short text classification. However, existing models have certain limitations. They rely on explicit data augmentation techniques to generate contrastive views, resulting in semantic corruption and noise. Additionally, these models only focus on learning the intrinsic consistency between the generated views, neglecting valuable discriminative information from other potential views. To address these issues, we propose a Simple graph contrastive learning framework for Short Text Classification (SimSTC). Our approach involves performing graph learning on multiple text-related component graphs to obtain multi-view text embeddings. Subsequently, we directly apply contrastive learning on these embeddings. Notably, our method eliminates the need for data augmentation operations to generate contrastive views while still leveraging the benefits of multi-view contrastive learning. Despite its simplicity, our model achieves outstanding performance, surpassing large language models on various datasets.
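The idea of applying contrastive learning directly to multi-view embeddings, with no augmentation step, can be illustrated with a minimal sketch. The abstract does not specify the loss function, so the InfoNCE-style objective, the `temperature` value, the function names, and the toy embeddings below are all assumptions made for illustration, not the authors' implementation. Each view is assumed to hold one embedding per text, as produced by graph learning on a different component graph.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def cross_view_info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss across two views (hypothetical choice of objective).

    Text i in view A is pulled toward text i in view B (the positive)
    and pushed away from every other text in view B (the negatives).
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Temperature-scaled cosine similarity to every text in the other view.
        logits = [
            sum(x * y for x, y in zip(anchor, other)) / temperature
            for other in b
        ]
        # Numerically stable log-sum-exp over positives and negatives.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


# Toy embeddings for three texts under two hypothetical views.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
view_b_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same embeddings, but assigned to the wrong texts.
view_b_shuffled = [view_b_aligned[1], view_b_aligned[2], view_b_aligned[0]]

aligned_loss = cross_view_info_nce(view_a, view_b_aligned)
shuffled_loss = cross_view_info_nce(view_a, view_b_shuffled)
print(aligned_loss < shuffled_loss)
```

In this sketch the contrastive views come from the different embeddings of the same text, so nothing is perturbed or dropped to create them. The loss is lower when the two views agree on which embeddings belong to the same text.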

Code Implementations: 1 repo