CL, LG · Apr 19, 2019

Zero-Shot Cross-Lingual Opinion Target Extraction

arXiv:1904.09122v11092 citations
Originality: Incremental advance
AI Analysis

This paper addresses the scarcity of annotated corpora for aspect-based sentiment analysis in many languages and offers a practical solution for cross-lingual NLP applications, though it is incremental in that it builds on existing multilingual techniques.

The paper tackles the lack of annotated data for opinion target extraction in many languages by proposing a zero-shot cross-lingual approach that combines multilingual word embeddings with a CNN. Relative to a model trained on target-language data, it reaches up to 77% of that model's performance when trained on a single source language and up to 87% when trained on multiple source languages.

Aspect-based sentiment analysis involves the recognition of so-called opinion target expressions (OTEs). To automatically extract OTEs, supervised learning algorithms are usually employed which are trained on manually annotated corpora. The creation of these corpora is labor-intensive and sufficiently large datasets are therefore usually only available for a very narrow selection of languages and domains. In this work, we address the lack of available annotated data for specific languages by proposing a zero-shot cross-lingual approach for the extraction of opinion target expressions. We leverage multilingual word embeddings that share a common vector space across various languages and incorporate these into a convolutional neural network architecture for OTE extraction. Our experiments with 5 languages give promising results: We can successfully train a model on annotated data of a source language and perform accurate prediction on a target language without ever using any annotated samples in that target language. Depending on the source and target language pairs, we reach performances in a zero-shot regime of up to 77% of a model trained on target language data. Furthermore, we can increase this performance up to 87% of a baseline model trained on target language data by performing cross-lingual learning from multiple source languages.
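
The zero-shot idea in the abstract can be illustrated with a minimal sketch. It is not the paper's model: the two-dimensional embeddings, the vocabulary, the training sentences, and the function names (`windows`, `train`, `predict`) are all hypothetical, translations are assumed to share exactly the same vector (an idealised shared space), and a single width-3 convolution filter trained with the perceptron rule stands in for the CNN. The point is only that a tagger trained on source-language tokens can be applied unchanged to target-language tokens when both live in one vector space.

```python
# Toy "multilingual" embeddings (hypothetical): each translation pair shares
# one vector, which idealises a perfectly aligned cross-lingual space.
EMB = {
    "the": (0.0, 0.0), "la": (0.0, 0.0),
    "was": (0.0, 0.0), "estaba": (0.0, 0.0),
    "food": (1.0, 0.0), "comida": (1.0, 0.0),
    "service": (1.0, 0.0), "servicio": (1.0, 0.0),
    "great": (0.0, 1.0), "excelente": (0.0, 1.0),
    "slow": (0.0, 1.0), "lento": (0.0, 1.0),
}
PAD = (0.0, 0.0)


def windows(tokens, width=3):
    """Build one feature vector per token: a bias term followed by the
    concatenated embeddings of the token and its neighbours, i.e. the
    receptive field of a single 1D convolution filter."""
    half = width // 2
    vecs = [PAD] * half + [EMB[t] for t in tokens] + [PAD] * half
    feats = []
    for i in range(len(tokens)):
        feat = [1.0]
        for vec in vecs[i:i + width]:
            feat.extend(vec)
        feats.append(feat)
    return feats


def score(weights, feat):
    return sum(w * x for w, x in zip(weights, feat))


def train(sentences, epochs=50):
    """Perceptron training of the single filter (a stand-in for CNN training)."""
    weights = [0.0] * 7  # bias + 3 window positions * 2 embedding dimensions
    for _ in range(epochs):
        mistakes = 0
        for tokens, labels in sentences:
            for feat, is_target in zip(windows(tokens), labels):
                sign = 1 if is_target else -1
                if sign * score(weights, feat) <= 0:
                    weights = [w + sign * x for w, x in zip(weights, feat)]
                    mistakes += 1
        if mistakes == 0:
            break
    return weights


def predict(weights, tokens):
    """Return True for tokens tagged as part of an opinion target."""
    return [score(weights, feat) > 0 for feat in windows(tokens)]


# Source-language (English) training data only; 1 marks the opinion target.
TRAIN_EN = [
    ("the food was great".split(), [0, 1, 0, 0]),
    ("the service was slow".split(), [0, 1, 0, 0]),
]

weights = train(TRAIN_EN)
# Zero-shot prediction on a Spanish sentence never seen during training.
print(predict(weights, "la comida estaba excelente".split()))
# → [False, True, False, False]
```

In this toy setting the transfer is perfect because the embeddings of translations coincide exactly. With real multilingual embeddings the alignment is only approximate, which is consistent with the abstract reporting a fraction (up to 77%, or 87% with multiple source languages) of in-language performance rather than parity.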

