LG CL SI Mar 6, 2014

Inducing Language Networks from Continuous Space Word Representations

Bryan Perozzi, Rami Al-Rfou, Vivek Kulkarni, Steven Skiena

arXiv:1403.1252v2, 13 citations

Originality: Incremental advance

AI Analysis

This work addresses a fundamental challenge in unsupervised learning for natural language processing by providing insights into latent space structures, though it is incremental as it builds on existing representation methods.

The paper tackles the problem of understanding and evaluating continuous space word representations by introducing a new perspective that views them as language networks, and finds that these induced networks exhibit meaningful community structure and differ from existing network creation methods.

Recent advancements in unsupervised feature learning have developed powerful latent representations of words. However, it is still not clear what makes one representation better than another and how we can learn the ideal representation. Understanding the structure of latent spaces attained is key to any future advancement in unsupervised learning. In this work, we introduce a new view of continuous space word representations as language networks. We explore two techniques to create language networks from learned features by inducing them for two popular word representation methods and examining the properties of their resulting networks. We find that the induced networks differ from other methods of creating language networks, and that they contain meaningful community structure.
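As a rough illustration of what inducing a network from learned features can look like, the sketch below links each word to its most similar neighbors in the embedding space. The abstract does not name the two techniques it explores, so the k-nearest-neighbor rule, the cosine similarity measure, the `knn_graph` helper, and the toy two-dimensional vectors are all assumptions made for this example, not the authors' method or data.

```python
import numpy as np

def knn_graph(vectors, k):
    """Return undirected edges (i, j), i < j, linking each row to its
    k most cosine-similar rows. Hypothetical helper for illustration."""
    # Normalize rows so the dot product equals cosine similarity.
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = normed @ normed.T
    # Exclude self-similarity so a word is never its own neighbor.
    np.fill_diagonal(sims, -np.inf)
    edges = set()
    for i in range(len(vectors)):
        # Indices of the k highest-similarity rows for word i.
        for j in np.argsort(-sims[i])[:k]:
            edges.add((min(i, int(j)), max(i, int(j))))
    return edges

# Toy embeddings (made up); real word representations have many more dimensions.
words = ["cat", "dog", "car", "truck"]
vectors = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])

edges = knn_graph(vectors, k=1)
named_edges = sorted((words[i], words[j]) for i, j in edges)
print(named_edges)
```

With these toy vectors, the graph splits into an animal pair and a vehicle pair, which is a very small analogue of the community structure the abstract refers to.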
