CL Aug 16, 2017

Learning Chinese Word Representations From Glyphs Of Characters

arXiv:1708.04755v1 39.31101 citations Has Code

Originality: Incremental advance

AI Analysis

This work aims to improve Chinese word representations for natural language processing tasks. The advance appears incremental, since it builds on existing character-embedding methods.

The paper tackles the problem of learning Chinese word representations by incorporating character glyph features, which a convolutional auto-encoder learns from character bitmaps. These glyph features improve word representations that are already enhanced by character embeddings. The paper also contributes several publicly released evaluation datasets in traditional Chinese.

In this paper, we propose new methods to learn Chinese word representations. Chinese characters are composed of graphical components, which carry rich semantics. It is common for a Chinese learner to comprehend the meaning of a word from these graphical components. As a result, we propose models that enhance word representations with character glyphs. The character glyph features are learned directly from the bitmaps of characters by a convolutional auto-encoder (convAE), and the glyph features improve Chinese word representations that are already enhanced by character embeddings. Another contribution of this paper is that we created several evaluation datasets in traditional Chinese and made them public.
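
As a rough illustration of the idea in the abstract, the sketch below composes a word vector from a word embedding plus the average of per-character vectors, where each character vector adds a glyph feature to a character embedding. The additive combination, the function name compose_word_vector, and all toy values are assumptions made for illustration. The abstract does not specify how the proposed models combine these signals, and the glyph features here are hand-written stand-ins for convAE outputs.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def compose_word_vector(
    word: str,
    word_emb: Dict[str, Vector],
    char_emb: Dict[str, Vector],
    glyph_feat: Dict[str, Vector],
) -> Vector:
    """Hypothetical composition: word embedding + mean over characters
    of (character embedding + glyph feature). Illustrative only."""
    per_char = [
        [c + g for c, g in zip(char_emb[ch], glyph_feat[ch])]
        for ch in word
    ]
    char_part = mean(per_char)
    return [w + c for w, c in zip(word_emb[word], char_part)]


# Toy 2-dimensional values (assumed, not taken from the paper).
word_emb = {"河流": [1.0, 0.0]}
char_emb = {"河": [0.5, 0.5], "流": [0.5, -0.5]}
# Stand-ins for glyph features that a convAE would produce from bitmaps.
glyph_feat = {"河": [0.25, 0.0], "流": [0.25, 0.0]}

result = compose_word_vector("河流", word_emb, char_emb, glyph_feat)
print(result)
```

In this toy setup the two characters share the same glyph feature, loosely mimicking a shared graphical component, so the glyph signal survives the averaging while the opposing parts of the character embeddings cancel.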
