Chinese Embedding via Stroke and Glyph Information: A Dual-channel View
This work aims to improve Chinese word embeddings for NLP applications by exploiting morphological information, though the contribution appears incremental because it builds on earlier findings that morphology is useful.
The authors tackled the problem of enriching Chinese word embeddings by incorporating both sequential stroke order and spatial glyph information, proposing a Dual-channel Word Embedding (DWE) model that showed superiority in word similarity and analogy tasks.
Recent studies have consistently given positive hints that morphology is helpful in enriching word embeddings. In this paper, we argue that Chinese word embeddings can be substantially enriched by the morphological information hidden in characters, which is reflected not only sequentially in stroke order but also spatially in character glyphs. We then propose a novel Dual-channel Word Embedding (DWE) model to realize the joint learning of the sequential and spatial information of characters. Evaluated on both word similarity and word analogy tasks, our model demonstrates its soundness and superiority in modelling the morphology of Chinese.
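To make the dual-channel idea concrete, the following is a minimal sketch rather than the DWE implementation. It assumes that the sequential channel averages embeddings of stroke n-grams and that the spatial channel summarizes a glyph bitmap; 2x2 average pooling stands in here for whatever learned glyph encoder the model actually uses. The stroke class ids, the bitmap, the dimension `DIM`, and all function names are hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple

DIM = 4  # illustrative embedding size (assumption, not from the paper)

Gram = Tuple[int, ...]


def stroke_ngrams(strokes: Sequence[int], n_min: int = 2, n_max: int = 3) -> List[Gram]:
    """Return all contiguous stroke n-grams with n_min <= n <= n_max."""
    return [
        tuple(strokes[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(strokes) - n + 1)
    ]


def sequential_channel(strokes: Sequence[int], table: Dict[Gram, List[float]],
                       rng: random.Random) -> List[float]:
    """Average the embeddings of a character's stroke n-grams."""
    grams = stroke_ngrams(strokes)
    vec = [0.0] * DIM
    for gram in grams:
        if gram not in table:
            # Unseen n-grams get a small random vector (untrained placeholder).
            table[gram] = [rng.uniform(-0.1, 0.1) for _ in range(DIM)]
        for k in range(DIM):
            vec[k] += table[gram][k]
    return [v / max(len(grams), 1) for v in vec]


def spatial_channel(bitmap: Sequence[Sequence[int]]) -> List[float]:
    """Summarize a square glyph bitmap by 2x2 average pooling (4 features)."""
    half = len(bitmap) // 2
    feats = []
    for r0 in (0, half):
        for c0 in (0, half):
            cells = [bitmap[r][c]
                     for r in range(r0, r0 + half)
                     for c in range(c0, c0 + half)]
            feats.append(sum(cells) / len(cells))
    return feats


def character_embedding(strokes: Sequence[int], bitmap: Sequence[Sequence[int]],
                        table: Dict[Gram, List[float]],
                        rng: random.Random) -> List[float]:
    """Combine both channels by element-wise sum (one possible choice)."""
    seq = sequential_channel(strokes, table, rng)
    spa = spatial_channel(bitmap)
    return [a + b for a, b in zip(seq, spa)]


if __name__ == "__main__":
    rng = random.Random(0)
    strokes = [1, 2, 3, 4]          # hypothetical stroke class ids
    bitmap = [[1, 1, 0, 0],         # hypothetical 4x4 glyph bitmap
              [1, 1, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 1, 1]]
    print(character_embedding(strokes, bitmap, {}, rng))
```

In a trained model, both channels would be learned jointly against a word-level objective. The element-wise sum is only one way to merge them, and concatenation would be an equally plausible choice under these assumptions.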