CL · Jul 16, 2020

SLK-NER: Exploiting Second-order Lexicon Knowledge for Chinese NER

arXiv:2007.08416v11.026 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of word boundary conflicts and negative word interference in Chinese NER, offering an incremental improvement for natural language processing applications.

The paper tackles the problem of erroneous information from lexical words in Chinese named entity recognition by introducing second-order lexicon knowledge to provide more semantic and word boundary features, resulting in state-of-the-art performance on three public datasets.

Although character-based models that use a lexicon have achieved promising results on the Chinese named entity recognition (NER) task, some lexical words introduce erroneous information because they are wrongly matched. Existing research has proposed many strategies for integrating lexicon knowledge. However, these approaches either rely on simple first-order lexicon knowledge, which provides insufficient word information and still suffers from boundary conflicts between matched words, or explore lexicon knowledge with a graph, where higher-order information may introduce negative words that disturb identification. To alleviate these limitations, we present new insight into the second-order lexicon knowledge (SLK) of each character in the sentence, which provides more lexical word information, including semantic and word boundary features. Based on this, we propose an SLK-based model with a novel strategy for integrating this lexicon knowledge. The proposed model can exploit more discernible lexical word information with the help of global context. Experimental results on three public datasets demonstrate the validity of SLK. The proposed model outperforms the state-of-the-art comparison methods.
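The abstract does not spell out how first-order and second-order lexicon knowledge are computed, so the following is only a minimal sketch under an assumed reading: first-order knowledge of a character is taken to be the lexicon words that cover it, and second-order knowledge is taken to be the words that cover its immediate neighbours but not the character itself. The function names, the `max_len` parameter, and the toy lexicon are all hypothetical and are not taken from the paper or its released code.

```python
from typing import List, Set, Tuple


def match_lexicon(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[Tuple[int, int, str]]:
    """Return (start, end, word) for every multi-character lexicon word
    found in the sentence. `end` is exclusive."""
    spans = []
    for start in range(len(sentence)):
        longest_end = min(len(sentence), start + max_len)
        for end in range(start + 2, longest_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                spans.append((start, end, word))
    return spans


def first_order(sentence: str, lexicon: Set[str]) -> List[List[str]]:
    """Assumed first-order knowledge: for each character position,
    the matched words that cover that position."""
    flk: List[List[str]] = [[] for _ in sentence]
    for start, end, word in match_lexicon(sentence, lexicon):
        for i in range(start, end):
            flk[i].append(word)
    return flk


def second_order(sentence: str, lexicon: Set[str]) -> List[List[str]]:
    """Assumed second-order knowledge: words covering the left or right
    neighbour of a character, excluding words that already cover the
    character itself. This is an illustrative definition only."""
    flk = first_order(sentence, lexicon)
    slk: List[List[str]] = []
    for i in range(len(sentence)):
        own = set(flk[i])
        seen = set()
        words = []
        for j in (i - 1, i + 1):
            if 0 <= j < len(sentence):
                for word in flk[j]:
                    if word not in own and word not in seen:
                        seen.add(word)
                        words.append(word)
        slk.append(words)
    return slk


# Toy example with a hypothetical lexicon.
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
flk = first_order(sentence, lexicon)
slk = second_order(sentence, lexicon)
print(flk[3])  # words covering the character "长"
print(slk[3])  # neighbour words that do not cover "长"
```

In this toy sentence, the first-order set for "长" contains the conflicting matches "市长" and "长江", while the assumed second-order set adds the neighbouring word "南京市", which ends immediately before "长" and so hints at a word boundary there. How the actual model encodes and fuses such words is described in the paper, not in this sketch.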
