Improving Opinion-Target Extraction with Character-Level Word Embeddings
This work addresses fine-grained sentiment analysis for applications such as customer reviews. Its contribution is incremental: it builds on existing methods and adds one specific enhancement.
The paper tackles the extraction of opinion target expressions from noisy user-generated text by integrating character-level word embeddings into a sequence labeling system, which yields a 3.3-point F1-score improvement over the baseline.
Fine-grained sentiment analysis has received increasing attention in recent years. Extracting opinion target expressions (OTE) from reviews is often an important step in fine-grained, aspect-based sentiment analysis. Retrieving this information from user-generated text, however, can be difficult. Customer reviews, for instance, often contain misspelled words and are difficult to process due to their domain-specific language. In this work, we investigate whether character-level models can improve the identification of opinion target expressions. We integrate information about the character structure of a word into a sequence labeling system using character-level word embeddings and show their positive impact on the system's performance. Specifically, we obtain an increase of 3.3 points in F1-score over our baseline model. In further experiments, we reveal the character patterns encoded in the learned embeddings and give a nuanced view of the performance differences between the two models.
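
To make the task framing concrete, the following is a minimal sketch of how opinion target extraction can be cast as sequence labeling and scored with F1. The abstract does not state the tagging scheme or the matching criterion, so the BIO tags, the exact-span matching, the helper names bio_to_spans and span_f1, and the toy sentence are all assumptions made for illustration, not the authors' setup.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect target spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag in ("B", "O"):
            # Both "B" and "O" close any span that is currently open.
            if start is not None:
                spans.add((start, i))
            start = i if tag == "B" else None
        elif tag == "I" and start is None:
            # Stray "I" without a preceding "B": treat it as a span start.
            start = i
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def span_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match span F1 (assumed evaluation criterion)."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(gold_spans & pred_spans)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_spans)
    recall = tp / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Invented toy review sentence; "screeen" is a deliberate misspelling.
tokens = ["The", "battery", "life", "is", "great", "but", "the", "screeen", "flickers"]
gold = ["O", "B", "I", "O", "O", "O", "O", "B", "O"]
# Hypothetical prediction that misses the misspelled target.
pred = ["O", "B", "I", "O", "O", "O", "O", "O", "O"]

targets = [" ".join(tokens[s:e]) for s, e in sorted(bio_to_spans(gold))]
score = span_f1(gold, pred)
print(targets, round(score, 3))
```

In this toy case the hypothetical prediction finds "battery life" but misses the misspelled "screeen", so precision is 1.0, recall is 0.5, and F1 is about 0.667. Errors of this kind on misspelled or domain-specific words are what the character-level information is meant to reduce.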