CL | LG | ML | Sep 5, 2019

Fusing Vector Space Models for Domain-Specific Applications

arXiv:1909.02307v1 | 10 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for better domain-specific word embeddings in natural language processing applications. The contribution appears incremental, since it builds on existing embedding techniques.

The paper tackles the problem of tuning word embeddings for specific domains. It proposes a method that automatically combines multiple domain-specific embeddings to improve their expressive power. According to the authors, the resulting embeddings consistently improve the performance of state-of-the-art machine learning algorithms on multiple tasks compared to generic embeddings.

We address the problem of tuning word embeddings for specific use cases and domains. We propose a new method that automatically combines multiple domain-specific embeddings, selected from a wide range of pre-trained domain-specific embeddings, to improve their combined expressive power. Our approach relies on two key components: 1) a ranking function, based on a new embedding similarity measure, that selects the most relevant embeddings to use given a domain and 2) a dimensionality reduction method that combines the selected embeddings to produce a more compact and efficient encoding that preserves the expressiveness. We empirically show that our method produces effective domain-specific embeddings that consistently improve the performance of state-of-the-art machine learning algorithms on multiple tasks, compared to generic embeddings trained on large text corpora.
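The two components described in the abstract, ranking candidate embeddings and then fusing the selected ones into a compact encoding, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the similarity measure or the reduction method. The sketch assumes a stand-in similarity (correlation between word-to-word cosine-similarity matrices) and a stand-in reduction (PCA via SVD over concatenated vectors). All function names, the toy data, and the shared-vocabulary assumption (row i is the same word in every matrix) are hypothetical.

```python
import numpy as np


def pairwise_cosine(E):
    """Cosine similarity between every pair of rows (words) in E."""
    unit = E / np.linalg.norm(E, axis=1, keepdims=True)
    return unit @ unit.T


def embedding_similarity(A, B):
    """Stand-in similarity (an assumption, not the paper's measure):
    correlation between the word-to-word cosine matrices of A and B.
    A and B may have different dimensionality but must share row order."""
    upper = np.triu_indices(A.shape[0], k=1)  # each word pair once
    return float(np.corrcoef(pairwise_cosine(A)[upper],
                             pairwise_cosine(B)[upper])[0, 1])


def rank_embeddings(target, candidates):
    """Order candidate names from most to least similar to the target."""
    scores = {name: embedding_similarity(target, E)
              for name, E in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


def fuse(embeddings, out_dim):
    """Stand-in fusion (an assumption): concatenate the vectors of each word,
    center the features, and keep the top `out_dim` principal components."""
    X = np.hstack(embeddings)
    X = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :out_dim] * S[:out_dim]


# Toy data: 50 shared words; "close" is a rotated, lightly perturbed copy of
# the target-domain embedding, "far" is unrelated random vectors.
rng = np.random.default_rng(0)
target = rng.normal(size=(50, 8))
rotation = np.linalg.qr(rng.normal(size=(8, 8)))[0]
candidates = {
    "close": target @ rotation + 0.01 * rng.normal(size=(50, 8)),
    "far": rng.normal(size=(50, 12)),
}

ranked = rank_embeddings(target, candidates)
fused = fuse([candidates[name] for name in ranked[:2]], out_dim=10)
print(ranked, fused.shape)
```

Comparing cosine-similarity matrices rather than raw vectors makes the stand-in measure insensitive to rotations of the embedding space and lets it compare embeddings of different dimensionality. Whether the paper's measure has the same properties is not stated in the abstract.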

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
