Error Analysis for Vietnamese Named Entity Recognition on Deep Neural Network Models
This work presents an error analysis for Vietnamese NER. The contribution is incremental: it focuses on improving existing deep learning models for a specific language.
The paper analyzes errors in state-of-the-art Vietnamese Named Entity Recognition (NER) systems, finds that the BLSTM-CNN-CRF model outperforms BLSTM-CRF, and provides insights for improving both model performance and corpus quality.
In recent years, Vietnamese Named Entity Recognition (NER) systems have achieved a major breakthrough by using deep neural network methods. This paper describes the primary errors of state-of-the-art NER systems for the Vietnamese language. We conducted experiments with BLSTM-CNN-CRF and BLSTM-CRF models using different word embeddings on the Vietnamese NER dataset provided by VLSP in 2016, which is used to evaluate most current Vietnamese NER systems. Because BLSTM-CNN-CRF gave better results, we analyze the errors of this model in detail. Our error analysis provides thorough insights for increasing the performance of Vietnamese NER and for improving the quality of the corpus in future work.
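To make the idea of an entity-level error analysis concrete, the following is a minimal sketch of how predicted spans might be compared against gold spans and sorted into error types. It is an illustration only and not the procedure used in the paper: the BIO tagging scheme, the category names (`missed`, `spurious`, `wrong_type`, `wrong_boundary`, `wrong_type_and_boundary`), and the helper functions `bio_to_spans` and `categorize_errors` are all assumptions introduced here.

```python
from collections import Counter


def bio_to_spans(tags):
    """Convert a BIO tag sequence into (start, end, type) spans; end is exclusive.

    Hypothetical helper: a stray I-X tag with no preceding B-X is treated
    as the start of a new entity.
    """
    spans = []
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def categorize_errors(gold_tags, pred_tags):
    """Count entity-level outcomes for one sentence (illustrative categories)."""
    gold = bio_to_spans(gold_tags)
    pred = bio_to_spans(pred_tags)
    counts = Counter()
    matched_pred = set()
    for g in gold:
        gs, ge, gt = g
        # Predicted spans that share at least one token with this gold span.
        overlaps = [p for p in pred if p[0] < ge and gs < p[1]]
        if not overlaps:
            counts["missed"] += 1
            continue
        matched_pred.update(overlaps)
        if g in overlaps:
            counts["correct"] += 1
        elif any(p[:2] == (gs, ge) for p in overlaps):
            counts["wrong_type"] += 1
        elif any(p[2] == gt for p in overlaps):
            counts["wrong_boundary"] += 1
        else:
            counts["wrong_type_and_boundary"] += 1
    # Predicted spans that overlap no gold entity at all.
    for p in pred:
        if p not in matched_pred:
            counts["spurious"] += 1
    return counts


# Toy example with made-up tags (not taken from the VLSP 2016 data).
gold = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG", "O"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-ORG", "O", "B-LOC"]
print(categorize_errors(gold, pred))
```

Summing such counters over a test set would give a rough picture of whether a model mostly confuses entity types, misplaces boundaries, or misses entities entirely. Cases where the gold annotation itself looks wrong would still need manual inspection, which is where an analysis of corpus quality comes in.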