CL Apr 14, 2020
Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text
Shengbin Jia, Ling Ding, Xiaojun Chen et al.
Chinese word segmentation is necessary to provide word-level information for Chinese named entity recognition (NER) systems. However, segmentation error propagation is a challenge for Chinese NER when processing colloquial data such as social media text. In this paper, we propose a model (UIcwsNN) that specializes in identifying entities in Chinese social media text, in particular by leveraging the ambiguous information of word segmentation. Such uncertain information contains all the potential segmentation states of a sentence, which provides a channel for the model to infer deep word-level characteristics. We propose a trilogy (i.e., candidate position embedding -> position selective attention -> adaptive word convolution) to encode uncertain word segmentation information and acquire an appropriate word-level representation. Experimental results on the social media corpus show that our model effectively alleviates the segmentation error cascading problem and achieves a significant performance improvement of more than 2% over previous state-of-the-art methods.
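As an illustration of what "all the potential segmentation states of a sentence" could look like as model input, here is a minimal sketch. It is not the UIcwsNN implementation: it assumes a hypothetical lexicon and a B/M/E/S (begin, middle, end, single) position scheme, neither of which is specified in the abstract, and the names `candidate_positions` and `max_len` are illustrative only.

```python
from typing import List, Set


def candidate_positions(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[Set[str]]:
    """For each character, collect every position tag (B, M, E, S) it could take
    under any lexicon word that matches the sentence at that location.

    Assumption: the lexicon and the B/M/E/S scheme are illustrative choices,
    not details taken from the paper.
    """
    tags: List[Set[str]] = [set() for _ in sentence]
    for start in range(len(sentence)):
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 1, last_end + 1):
            word = sentence[start:end]
            if len(word) == 1:
                # Any character may stand alone as a single-character word.
                tags[start].add("S")
                continue
            if word not in lexicon:
                continue
            tags[start].add("B")      # first character of the candidate word
            tags[end - 1].add("E")    # last character of the candidate word
            for i in range(start + 1, end - 1):
                tags[i].add("M")      # interior characters


    return tags


# Hypothetical lexicon for a classic ambiguous sentence.
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
for char, tagset in zip("南京市长江大桥", candidate_positions("南京市长江大桥", lexicon)):
    print(char, sorted(tagset))
```

A character such as 市 receives several tags at once because it can end 南京市 or begin 市长. Keeping the whole set, rather than committing to one segmentation, is the kind of uncertain information the abstract describes.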
CL Feb 23, 2020
Unique Chinese Linguistic Phenomena
Shengbin Jia
Linguistics has the characteristics of generality, stability, and nationality, which affect the formulation of extraction strategies and should be incorporated into relation extraction. Chinese open relation extraction is not well established, because the complexity of Chinese linguistics makes it harder to carry out and the methods developed for English are not directly compatible with Chinese. The differences between Chinese and English linguistics are mainly reflected in morphology and syntax.
CL Jul 26, 2019
Hybrid Neural Tagging Model for Open Relation Extraction
Shengbin Jia, Yang Xiang
Open relation extraction (ORE) remains a challenge: the goal is to obtain a semantic representation by discovering arbitrary relation tuples in unstructured text. Conventional methods depend heavily on feature engineering or syntactic parsing, which makes them inefficient or prone to error cascading. Recently, leveraging supervised deep learning architectures to address the ORE task has become a promising direction. However, there are two main challenges: (1) the lack of sufficient labeled corpora to support supervised training; (2) the need to explore a specific neural architecture that adapts to the characteristics of open relation extraction. In this paper, to overcome these difficulties, we build a large-scale, high-quality training corpus in a fully automated way, and design a tagging scheme that helps transform the ORE task into a sequence tagging problem. Furthermore, we propose a hybrid neural network model (HNN4ORT) for open relation tagging. The model employs the Ordered Neurons LSTM to encode potential syntactic information for capturing the associations among the arguments and relations. It also introduces a novel Dual Aware Mechanism, comprising Local-aware Attention and Global-aware Convolution. The two kinds of awareness complement each other, so that the model can take the sentence-level semantics as a global perspective and at the same time exploit salient local features to achieve sparse annotation. Experimental results on various testing sets show that our model achieves state-of-the-art performance compared with conventional methods and other neural models.
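The idea of transforming ORE into sequence tagging can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's tagging scheme: the BIO-style labels `E1`, `R`, and `E2`, the half-open token spans, and the function name `tag_tuple` are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # half-open [start, end) token span


def tag_tuple(tokens: Sequence[str], arg1: Span, rel: Span, arg2: Span) -> List[str]:
    """Turn one (argument 1, relation, argument 2) tuple into per-token tags.

    Assumption: a plain BIO scheme with labels E1 / R / E2; the abstract does
    not specify the label set the authors actually use.
    """
    tags = ["O"] * len(tokens)  # tokens outside the tuple stay unlabeled
    for label, (start, end) in (("E1", arg1), ("R", rel), ("E2", arg2)):
        for i in range(start, end):
            prefix = "B-" if i == start else "I-"
            tags[i] = prefix + label
    return tags


tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii", "."]
tags = tag_tuple(tokens, arg1=(0, 2), rel=(2, 5), arg2=(5, 6))
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Once every sentence is labeled this way, any sequence tagger can be trained on it. Most tokens in a long sentence receive `O`, which may be what the abstract means by sparse annotation.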
AI Sep 25, 2018
Triple Trustworthiness Measurement for Knowledge Graph
Shengbin Jia, Yang Xiang, Xiaojun Chen
A knowledge graph (KG) uses triples to describe facts in the real world. It has been widely used in intelligent analysis and applications. However, noise and conflicts are inevitably introduced during construction. KG-based tasks and applications nevertheless assume that the knowledge in the KG is completely correct, which inevitably introduces potential deviations. In this paper, we establish a knowledge graph triple trustworthiness measurement model that quantifies the semantic correctness of triples and the degree to which the facts they express are true. The model is a crisscrossing neural network structure. It synthesizes the internal semantic information in the triples and the global inference information of the KG to achieve trustworthiness measurement and fusion at three levels: the entity level, the relationship level, and the KG global level. We analyzed the validity of the confidence values output by the model, and conducted experiments on the real-world dataset FB15K (from Freebase) for the knowledge graph error detection task. The experimental results show that, compared with other models, our model achieves significant and consistent improvements.
AI Sep 25, 2018
Chinese User Service Intention Classification Based on Hybrid Neural Network
Shengbin Jia, Yang Xiang
Intelligent services have emerged to satisfy consumers' increasing demand for personalized service. Recognizing user service intentions is an important challenge for an intelligent service system that aims to provide precise service. Because of the noise in user requirement descriptions, it is difficult for such a system to understand the semantics of user demands, which leads to poor recognition performance. Therefore, a hybrid neural network classification model based on BiLSTM and CNN is proposed to recognize users' service intentions. The model can fuse the temporal semantics and spatial semantics of the user descriptions. The experimental results show that our model outperforms the other models compared, reaching an F1 score of 0.94.
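For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from raw counts for a single class. The counts are made-up illustrative values, not results from the paper, and the abstract does not say whether the reported score is macro- or micro-averaged.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Compute F1 = 2 * P * R / (P + R) for one class from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0  # convention: F1 is 0 when precision or recall is undefined or zero
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one intention class (illustrative only).
print(round(f1_score(true_positives=8, false_positives=2, false_negatives=2), 2))
```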