An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese
This work addresses sentiment analysis for Japanese text, providing an incremental improvement by demonstrating the efficiency of transfer learning in a low-resource language context.
The paper tackles sentiment analysis in Japanese by applying transfer learning to binary and multi-class classification on product and movie review datasets. It reports that transfer learning-based models outperform task-specific models trained on three times as much data, and that they perform just as well when the language model is pre-trained on only 1/30 of the data.
Text classification approaches have usually required task-specific model architectures and large labeled datasets. Recently, thanks to the rise of text-based transfer learning techniques, it has become possible to pre-train a language model in an unsupervised manner and leverage it to perform effectively on downstream tasks. In this work we focus on Japanese and show the potential of transfer learning techniques for text classification. Specifically, we perform binary and multi-class sentiment classification on the Rakuten product review and Yahoo movie review datasets. We show that transfer learning-based approaches perform better than task-specific models trained on 3 times as much data. Furthermore, these approaches perform just as well when the language model is pre-trained on only 1/30 of the data. We release our pre-trained models and code as open source.
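The data-size comparisons above (one third of the labeled data, 1/30 of the pre-training data) depend on drawing reduced subsets of a corpus. The following is a minimal sketch of how such subsets might be built reproducibly. It is not the authors' released code: the helper `subsample`, the toy corpora, and the choice of uniform random sampling are all assumptions made for illustration.

```python
import random


def subsample(examples, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of `examples`.

    Hypothetical helper: uniform sampling without replacement is an
    assumption, not a procedure stated in the abstract.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not examples:
        return []
    # Keep at least one example so tiny corpora do not collapse to empty.
    k = max(1, round(len(examples) * fraction))
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    return rng.sample(examples, k)


# Toy stand-ins for an unlabeled pre-training corpus and a labeled review set.
unlabeled_corpus = [f"sentence {i}" for i in range(3000)]
labeled_reviews = [(f"review {i}", i % 2) for i in range(900)]

# 1/30 of the pre-training data, and one third of the labeled data.
small_pretrain_corpus = subsample(unlabeled_corpus, 1 / 30)
small_finetune_set = subsample(labeled_reviews, 1 / 3)

print(len(small_pretrain_corpus), len(small_finetune_set))
```

Fixing the seed keeps the subsets identical across runs, so differences between models can be attributed to data size rather than to which examples happened to be drawn.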