Construction of Vietnamese SentiWordNet by using Vietnamese Dictionary
This work addresses the problem of sentiment analysis for Vietnamese language users by providing a lexical resource, though the contribution is incremental, as it adapts existing methods to a new language context.
The paper tackles the lack of a Vietnamese SentiWordNet by proposing a method to construct one from a Vietnamese dictionary instead of WordNet. The resulting VSWN contains 39,561 synsets, and its scores differ from those of English SentiWordNet by 0.066 for positivity and 0.052 for negativity, a result the authors report as competitive.
SentiWordNet is an important lexical resource supporting sentiment analysis in opinion mining applications. In this paper, we propose a novel approach to construct a Vietnamese SentiWordNet (VSWN). SentiWordNet is typically generated from WordNet, and each of its synsets carries numerical scores that indicate its opinion polarity. Many previous studies obtained these scores by applying a machine learning method to WordNet. However, a Vietnamese WordNet was not available at the time of writing. Therefore, we propose a method to construct VSWN from a Vietnamese dictionary rather than from WordNet. We show the effectiveness of the proposed method by automatically generating a VSWN with 39,561 synsets. The method is tested experimentally on 266 synsets with respect to positivity and negativity. It attains results competitive with English SentiWordNet, with differences of 0.066 and 0.052 for the positivity and negativity sets, respectively.
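
To make the reported "differences" concrete, the following is a minimal sketch of how two SentiWordNet-style resources could be compared on a shared test set. The abstract does not specify the metric, so mean absolute difference is an assumption here, as are the names `SentiSynset` and `mean_abs_difference` and the toy scores, none of which come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentiSynset:
    """One SentiWordNet-style entry (hypothetical structure, not the paper's)."""
    synset_id: str  # illustrative identifier
    pos: float      # positivity score in [0, 1]
    neg: float      # negativity score in [0, 1]

    @property
    def obj(self) -> float:
        # In English SentiWordNet, pos + neg + obj sums to 1 per synset.
        return 1.0 - self.pos - self.neg


def mean_abs_difference(ours, reference, attr):
    """Average |ours - reference| for one score over synsets found in both.

    Assumed metric: the abstract only reports "differences" without a formula.
    """
    shared = ours.keys() & reference.keys()
    if not shared:
        raise ValueError("no shared synsets to compare")
    total = sum(
        abs(getattr(ours[key], attr) - getattr(reference[key], attr))
        for key in shared
    )
    return total / len(shared)


# Toy scores, invented for illustration only (not data from the paper).
vswn = {
    "a": SentiSynset("a", pos=0.5, neg=0.25),
    "b": SentiSynset("b", pos=0.0, neg=0.75),
}
eswn = {
    "a": SentiSynset("a", pos=0.25, neg=0.25),
    "b": SentiSynset("b", pos=0.25, neg=0.5),
}

pos_diff = mean_abs_difference(vswn, eswn, "pos")
neg_diff = mean_abs_difference(vswn, eswn, "neg")
print(f"positivity difference: {pos_diff}, negativity difference: {neg_diff}")
```

Under this assumed metric, a smaller value means the generated scores sit closer to the reference resource on the evaluated synsets.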