CR CL Aug 31, 2022

Application of Data Encryption in Chinese Named Entity Recognition

Kaifang Long, Jikun Dong, Shengyu Fan, Yanfang Geng, Yang Cao, Han Zhao, Hui Yu, Weizhi Xu

arXiv:2208.14627v1 2.9 h-index: 31

Originality: Synthesis-oriented

AI Analysis

This paper addresses data leakage in sensitive domains such as biomedicine and the military, though it appears incremental because it applies existing encryption methods to a new task.

The paper tackles data privacy in named entity recognition by proposing an encryption learning framework that trains deep neural networks on encrypted data. It reports satisfactory results on six Chinese datasets, with some models outperforming their unencrypted counterparts.

Deep learning has recently brought dramatic improvements in named entity recognition. However, in some fields, such as biomedicine and the military, privacy and confidentiality requirements leave too little data available to train deep neural networks. In this paper, we propose an encryption learning framework to address data leakage and the difficulty of disclosing sensitive data in these domains. For the first time, we introduce multiple encryption algorithms to encrypt the training data in the named entity recognition task; in other words, we train the deep neural network on encrypted data. We conduct experiments on six Chinese datasets, three of which we constructed ourselves. The results show that the encryption method achieves satisfactory performance. Some models trained on encrypted data even outperform their counterparts trained on unencrypted data, which supports the effectiveness of the introduced encryption method and mitigates the data leakage problem to a certain extent.
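To make the idea of training on encrypted data concrete, the sketch below shows one way character-level NER examples could be transformed before training. The abstract does not name the encryption algorithms used, so this is not the authors' method: it swaps in a deterministic keyed substitution (HMAC-SHA256 truncated to 8 hex characters) as an illustrative stand-in. The function names, the key, the example sentence, and the BIOES-style labels are all assumptions made for this example.

```python
import hashlib
import hmac


def encrypt_char(ch: str, key: bytes) -> str:
    """Map one character to an 8-hex-digit pseudonym with a keyed hash.

    Assumption: a keyed hash stands in for the paper's (unspecified)
    encryption algorithms. It is deterministic, so the same character
    always yields the same token and a model can still learn patterns.
    """
    digest = hmac.new(key, ch.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:8]


def encrypt_example(chars, labels, key):
    """Encrypt the characters of one sentence; entity labels stay unchanged."""
    if len(chars) != len(labels):
        raise ValueError("chars and labels must have the same length")
    return [encrypt_char(c, key) for c in chars], list(labels)


# Hypothetical key and toy sentence, for illustration only.
key = b"hypothetical-secret-key"
chars = list("张三在北京工作")
labels = ["B-PER", "E-PER", "O", "B-LOC", "E-LOC", "O", "O"]

enc_chars, enc_labels = encrypt_example(chars, labels, key)
for token, label in zip(enc_chars, enc_labels):
    print(token, label)
```

Because the mapping is applied per character and the label sequence is left untouched, the encrypted tokens can be fed to a standard sequence-labeling model in place of the original characters. Whether a particular cipher preserves enough structure for good NER performance is the empirical question the paper's experiments address.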
