Rethinking Attribute Representation and Injection for Sentiment Classification
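This work revisits how and where text attributes, such as user and product information, should be represented and injected into sentiment classification models for product reviews. It shows that the standard practice of using attributes as attention biases is the least effective option and proposes a representation that improves performance.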
The paper tackles the problem of incorporating text attributes, such as user and product information, into sentiment classification models. It shows that the standard method of using attributes as biases in the attention mechanism is the least effective, and that the proposed method, which represents attributes as chunk-wise importance weight matrices, achieves significant improvements over that standard approach and outperforms state-of-the-art models.
Text attributes, such as user and product information in product reviews, have been used to improve the performance of sentiment classification models. The de facto standard method is to incorporate them as additional biases in the attention mechanism, and further performance gains have been achieved by extending the model architecture. In this paper, we show that the above method is the least effective way to represent and inject attributes. To demonstrate this, unlike previous models with complicated architectures, we limit our base model to a simple BiLSTM classifier with attention, and instead focus on how and where the attributes should be incorporated in the model. We propose to represent attributes as chunk-wise importance weight matrices and consider four locations in the model (i.e., embedding, encoding, attention, classifier) to inject attributes. Experiments show that our proposed method achieves significant improvements over the standard approach and that the attention mechanism is the worst location to inject attributes, contradicting prior work. We also outperform the state-of-the-art despite our use of a simple base model. Finally, we show that these representations transfer well to other tasks. Model implementation and datasets are released here: https://github.com/rktamplayo/CHIM.
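To make the idea of a chunk-wise importance weight matrix more concrete, the following is a minimal NumPy sketch and not the released implementation. It assumes that an attribute embedding is projected to a small matrix with one importance value per chunk, that this small matrix is expanded by repetition to the shape of a layer's weight matrix, and that the result rescales those weights element-wise. The function names, the tanh squashing, and the choice of the classifier layer as the injection point are all illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np

def chunkwise_importance(attr_vec, proj, out_dim, in_dim, chunks_out, chunks_in):
    """Expand a small per-chunk importance matrix to shape (out_dim, in_dim).

    attr_vec: attribute embedding (e.g. for a user or product), shape (attr_dim,)
    proj:     hypothetical projection, shape (chunks_out * chunks_in, attr_dim)
    """
    if out_dim % chunks_out or in_dim % chunks_in:
        raise ValueError("dimensions must be divisible by the chunk counts")
    # One importance value per chunk; tanh is an assumed squashing function.
    small = np.tanh(proj @ attr_vec).reshape(chunks_out, chunks_in)
    # Repeat each value so that it covers its whole chunk of the weight matrix.
    full = np.repeat(small, out_dim // chunks_out, axis=0)
    full = np.repeat(full, in_dim // chunks_in, axis=1)
    return full

def attribute_linear(x, W, b, importance):
    """Linear layer whose weights are rescaled element-wise by the importance matrix."""
    return (W * importance) @ x + b

# Toy usage: inject a user attribute at an (assumed) classifier layer.
rng = np.random.default_rng(0)
attr_dim, in_dim, out_dim = 8, 6, 4      # toy sizes, chosen for illustration
chunks_out, chunks_in = 2, 3             # 2 x 3 = 6 importance values in total

user_vec = rng.normal(size=attr_dim)     # stands in for a learned user embedding
proj = rng.normal(size=(chunks_out * chunks_in, attr_dim))
W = rng.normal(size=(out_dim, in_dim))
b = np.zeros(out_dim)
x = rng.normal(size=in_dim)              # stands in for the attended BiLSTM output

importance = chunkwise_importance(user_vec, proj, out_dim, in_dim, chunks_out, chunks_in)
logits = attribute_linear(x, W, b, importance)
print(importance.shape, logits.shape)
```

In this sketch the attribute contributes only chunks_out * chunks_in importance values per layer rather than a full out_dim * in_dim matrix, which keeps the number of attribute-specific parameters small. The same rescaling could in principle be applied to the weights at the embedding, encoding, or attention locations named in the abstract.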