CL
Jan 23, 2025

Multi-Level Attention and Contrastive Learning for Enhanced Text Classification with an Optimized Transformer

Jia Gao, Guiran Liu, Binrong Zhu, Shicheng Zhou, Hongye Zheng, Xiaoxuan Liao

arXiv:2501.13467v1
10.912 citations
h-index: 10
2025 5th International Conference on Consumer Electronics and Computer Engineering (ICCECE)

Originality: Incremental advance

AI Analysis

This work targets the efficiency and accuracy of text classification and is relevant to both researchers and practitioners, but its contribution is incremental because it builds on existing Transformer methods.

The paper addresses two weaknesses of the standard Transformer in text classification: capturing deep semantic relationships and keeping computational complexity under control. It adds a multi-level attention mechanism and a contrastive learning strategy to an improved Transformer model, which the authors report outperforms BiLSTM, CNN, the standard Transformer, and BERT in accuracy, F1 score, and recall.

This paper studies a text classification algorithm based on an improved Transformer to improve the performance and efficiency of the model in text classification tasks. Aiming at the shortcomings of the traditional Transformer model in capturing deep semantic relationships and optimizing computational complexity, this paper introduces a multi-level attention mechanism and a contrastive learning strategy. The multi-level attention mechanism effectively models the global semantics and local features in the text by combining global attention with local attention; the contrastive learning strategy enhances the model's ability to distinguish between different categories by constructing positive and negative sample pairs while improving the classification effect. In addition, in order to improve the training and inference efficiency of the model on large-scale text data, this paper designs a lightweight module to optimize the feature transformation process and reduce the computational cost. Experimental results on the dataset show that the improved Transformer model outperforms the comparative models such as BiLSTM, CNN, standard Transformer, and BERT in terms of classification accuracy, F1 score, and recall rate, showing stronger semantic representation ability and generalization performance. The method proposed in this paper provides a new idea for algorithm optimization in the field of text classification and has good application potential and practical value. Future work will focus on studying the performance of this model in multi-category imbalanced datasets and cross-domain tasks and explore the integration wi
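The abstract names two mechanisms without giving their formulas: attention that combines a global and a local view, and a contrastive objective built from positive and negative pairs. The following is a minimal sketch of how such components are commonly written, not the authors' implementation. Everything specific in it is an assumption for illustration: the fixed local window, the scalar mixing weight `alpha`, the absence of learned query/key/value projections, and the supervised-contrastive-style loss in which same-label samples are positives and all other samples are negatives. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(x, window=None):
    """Single-head self-attention over token vectors x (queries = keys = values).

    With window=None every token attends to every token (global attention);
    otherwise token i only attends to tokens j with |i - j| <= window (local).
    """
    dim = len(x[0])
    out = []
    for i, query in enumerate(x):
        visible = [j for j in range(len(x)) if window is None or abs(i - j) <= window]
        scores = [sum(a * b for a, b in zip(query, x[j])) / math.sqrt(dim) for j in visible]
        weights = softmax(scores)
        out.append([sum(w * x[j][k] for w, j in zip(weights, visible)) for k in range(dim)])
    return out


def multi_level_attention(x, window=1, alpha=0.5):
    """Blend global and local attention outputs (alpha is an assumed fixed weight)."""
    global_out = attend(x)
    local_out = attend(x, window=window)
    return [
        [alpha * g + (1 - alpha) * l for g, l in zip(g_row, l_row)]
        for g_row, l_row in zip(global_out, local_out)
    ]


def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised-contrastive-style loss: same-label pairs are positives, the rest negatives."""
    n = len(embeddings)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without any positive contributes nothing
        logits = {j: cosine(embeddings[i], embeddings[j]) / temperature for j in range(n) if j != i}
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        losses.append(-sum(logits[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0
```

In this sketch, `multi_level_attention` takes a list of token vectors and returns one blended vector per token, and `contrastive_loss` takes sentence-level embeddings with their class labels. The loss shrinks as same-class embeddings move closer together relative to other classes, which is the behaviour the abstract attributes to its contrastive strategy.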
