Jingpeng Zhao

2 Papers

CL · Apr 18, 2022
HFT-ONLSTM: Hierarchical and Fine-Tuning Multi-label Text Classification

Pengfei Gao, Jingpeng Zhao, Yinglong Ma et al.

Many important real-world classification problems involve a large number of closely related categories organized in a hierarchy or taxonomy. Achieving high accuracy in hierarchical multi-label text classification (HMTC) over such category sets remains challenging. In this paper, we present a hierarchical, fine-tuning approach based on the Ordered Neurons LSTM network, abbreviated as HFT-ONLSTM, for more accurate level-by-level HMTC. First, we present a novel approach to learning joint embeddings from parent category labels and textual data, so as to capture the joint features of category labels and texts accurately. Second, a fine-tuning technique is adopted for training parameters so that classification results at the upper level contribute to classification at the lower level. Finally, we conduct extensive experiments comparing against state-of-the-art hierarchical and flat multi-label text classification approaches on two benchmark datasets. The results show that HFT-ONLSTM outperforms these approaches, achieving superior performance at reduced computational cost.
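
The level-by-level fine-tuning idea in this abstract can be illustrated with a minimal sketch. It assumes one plausible reading: the encoder weights trained at the upper level are copied to initialise the lower-level model, while the output layer is rebuilt for the new label set. The names `new_level_model` and `init_from_upper`, the plain-dictionary "model", and all sizes are hypothetical, and a toy weight matrix stands in for the ON-LSTM encoder.

```python
import copy
import random


def new_level_model(encoder, hidden_size, num_classes, seed=0):
    """Build a per-level model: an encoder plus a freshly initialised output layer."""
    rng = random.Random(seed)
    output = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
              for _ in range(num_classes)]
    return {"encoder": encoder, "output": output}


def init_from_upper(upper_model, num_lower_classes, seed=1):
    """Initialise the lower-level model from the trained upper-level one.

    Assumption: encoder weights are copied and then fine-tuned further, while
    the output layer is re-created because the lower level has other labels.
    """
    encoder = copy.deepcopy(upper_model["encoder"])
    hidden_size = len(upper_model["output"][0])
    return new_level_model(encoder, hidden_size, num_lower_classes, seed)


# Hypothetical sizes: 4 hidden units, 3 top-level and 7 second-level categories.
level1 = new_level_model({"W": [[0.5] * 4 for _ in range(4)]}, 4, 3)
level2 = init_from_upper(level1, num_lower_classes=7)
```

Copying rather than sharing the encoder keeps the upper-level model unchanged while the lower-level one is trained, which is one simple way to let upper-level results inform the lower level.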

NE · Apr 6, 2020
Joint Embedding of Words and Category Labels for Hierarchical Multi-label Text Classification

Jingpeng Zhao, Yinglong Ma

Text classification has become increasingly challenging as label granularity is continuously refined and the number of labels grows. To address this, some research has focused on strategies that exploit the hierarchical structure of problems with a large number of categories. Hierarchical text classification (HTC) has received extensive attention and has broad application prospects. Making full use of the relationship between parent and child categories can greatly improve classification performance. In this paper, we propose a joint embedding of text and parent category based on hierarchical fine-tuning ordered neurons LSTM (HFT-ONLSTM) for HTC. Our method makes full use of the connection between upper-level and lower-level labels. Experiments show that our model outperforms the state-of-the-art hierarchical model at a lower computational cost.
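
As a rough illustration of a joint input built from text and parent category, the sketch below prepends the parent label to the word sequence and maps both into one shared index space, so that a single embedding table could cover words and labels. This is an assumption about how such an input might be constructed, not the authors' implementation; `build_vocab`, `joint_input`, and the `<label:...>` token format are hypothetical.

```python
def build_vocab(words, labels):
    """Put words and category labels in one shared index space."""
    vocab = {"<unk>": 0}
    for token in list(labels) + list(words):
        vocab.setdefault(token, len(vocab))
    return vocab


def joint_input(parent_label, tokens, vocab):
    """Prepend the parent category to the text and map every token to an id."""
    sequence = [parent_label] + list(tokens)
    return [vocab.get(tok, vocab["<unk>"]) for tok in sequence]


# Hypothetical labels and words, for illustration only.
labels = ["<label:sports>", "<label:science>"]
words = ["the", "team", "won", "match"]
vocab = build_vocab(words, labels)

# "cup" is not in the vocabulary, so it maps to the <unk> id 0.
ids = joint_input("<label:sports>", ["the", "team", "won", "the", "cup"], vocab)
# ids == [1, 3, 4, 5, 3, 0]
```

At prediction time the parent label would come from the classifier one level up, so each level sees the text together with its predicted parent category.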