CL · Apr 11, 2018

Evaluating Word Embedding Hyper-Parameters for Similarity and Analogy Tasks

arXiv:1804.04211v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This paper addresses a methodological gap for NLP researchers, but its contribution is incremental because it builds on existing evaluation frameworks.

The study addresses the poorly understood impact of hyper-parameters on word embeddings by empirically measuring how vector dimensions and corpus size affect embedding quality, and it reports their effects on similarity and analogy tasks.

The versatility of word embeddings is attracting researchers from many fields. However, the impact of hyper-parameters when training embedding models is often poorly understood. How much do hyper-parameters such as vector dimensions and corpus size affect the quality of embeddings, and how do these results translate to downstream applications? Using standard embedding evaluation metrics and datasets, we conduct a study to empirically measure the impact of these hyper-parameters.
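To make the two evaluation tasks named in the title concrete, here is a minimal sketch of how similarity and analogy scores are commonly computed. It is not the paper's code: the abstract does not say which datasets or scoring rules were used, so the Spearman-based similarity score, the vector-offset analogy rule, the function names, and the toy vectors below are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def similarity_score(emb, rated_pairs):
    """Spearman correlation between model cosines and human ratings."""
    model = [cosine(emb[a], emb[b]) for a, b, _ in rated_pairs]
    human = [rating for _, _, rating in rated_pairs]
    return pearson(ranks(model), ranks(human))


def analogy_accuracy(emb, questions):
    """Fraction of 'a is to b as c is to ?' questions answered correctly.

    Uses the vector-offset rule: pick the word closest to b - a + c,
    excluding the three question words themselves.
    """
    correct = 0
    for a, b, c, expected in questions:
        target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
        candidates = [w for w in emb if w not in (a, b, c)]
        predicted = max(candidates, key=lambda w: cosine(emb[w], target))
        correct += predicted == expected
    return correct / len(questions)


# Hypothetical 2-dimensional toy embeddings, chosen only for illustration.
toy_emb = {
    "king": [0.9, 0.8],
    "queen": [0.9, 0.1],
    "man": [0.1, 0.8],
    "woman": [0.1, 0.1],
    "apple": [-0.5, -0.5],
}
# (word1, word2, made-up human similarity rating)
toy_pairs = [("king", "queen", 8.5), ("man", "woman", 8.0), ("king", "apple", 1.0)]
# (a, b, c, expected): man is to king as woman is to queen
toy_questions = [("man", "king", "woman", "queen")]

print(similarity_score(toy_emb, toy_pairs))
print(analogy_accuracy(toy_emb, toy_questions))
```

In a study like the one described, these two scores would be recomputed for embeddings trained at each vector dimension and corpus size, which makes the effect of each hyper-parameter directly comparable across settings.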

