Lisheng Gao

0.8CLFeb 9, 2020

Application of Pre-training Models in Named Entity Recognition

Yu Wang, Yining Sun, Zuchang Ma et al.

Named Entity Recognition (NER) is a fundamental Natural Language Processing (NLP) task to extract entities from unstructured data. The previous methods for NER were based on machine learning or deep learning. Recently, pre-training models have significantly improved performance on multiple NLP tasks. In this paper, firstly, we introduce the architecture and pre-training tasks of four common pre-training models: BERT, ERNIE, ERNIE2.0-tiny, and RoBERTa. Then, we apply these pre-training models to a NER task by fine-tuning, and compare the effects of the different model architecture and pre-training tasks on the NER task. The experiment results showed that RoBERTa achieved state-of-the-art results on the MSRA-2006 dataset.

4.3LGNov 15, 2016

Deep Restricted Boltzmann Networks

Hengyuan Hu, Lisheng Gao, Quanbin Ma

Building a good generative model for image has long been an important topic in computer vision and machine learning. Restricted Boltzmann machine (RBM) is one of such models that is simple but powerful. However, its restricted form also has placed heavy constraints on the models representation power and scalability. Many extensions have been invented based on RBM in order to produce deeper architectures with greater power. The most famous ones among them are deep belief network, which stacks multiple layer-wise pretrained RBMs to form a hybrid model, and deep Boltzmann machine, which allows connections between hidden units to form a multi-layer structure. In this paper, we present a new method to compose RBMs to form a multi-layer network style architecture and a training method that trains all layers jointly. We call the resulted structure deep restricted Boltzmann network. We further explore the combination of convolutional RBM with the normal fully connected RBM, which is made trivial under our composition framework. Experiments show that our model can generate descent images and outperform the normal RBM significantly in terms of image quality and feature quality, without losing much efficiency for training.

Lisheng Gao

2 Papers