Hierarchical classification of e-commerce related social media
This paper addresses a domain-specific challenge in e-commerce social media analysis, but its results are incremental.
The paper tackles the problem of classifying short, noisy tweets into Amazon product categories using labeled tweets, unlabeled tweets, and Amazon reviews, and it achieves modest improvements through query and document expansion techniques.
In this paper, we attempt to classify tweets into root categories of the Amazon browse node hierarchy using a set of tweets with browse node ID labels, a much larger set of tweets without labels, and a set of Amazon reviews. Twitter data presents unique challenges: the samples are short (under 140 characters) and often contain misspellings or abbreviations that are trivial for a human to decipher but difficult for a computer to parse. We implement a variety of query and document expansion techniques in an effort to improve information retrieval, with modest success.
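The abstract does not spell out which expansion techniques are used, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each root category is represented by a pseudo-document built from review text, that a tweet is treated as a query, and that query expansion means appending known expansions of abbreviations before TF-IDF cosine scoring. The abbreviation map, the toy review text, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical abbreviation map for query expansion (illustrative only).
ABBREVIATIONS = {"hdphones": "headphones", "bk": "book", "luv": "love"}

# Made-up stand-ins for Amazon reviews grouped by root category.
CATEGORY_REVIEWS = {
    "Electronics": ["great headphones with clear sound",
                    "the camera battery lasts long"],
    "Books": ["a gripping novel, could not put the book down",
              "the author writes well"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(tokens):
    """Keep each token and append its expansion when one is known."""
    expanded = []
    for tok in tokens:
        expanded.append(tok)
        if tok in ABBREVIATIONS:
            expanded.append(ABBREVIATIONS[tok])
    return expanded


def build_index(category_reviews):
    """Build one term-count pseudo-document per category plus smoothed IDF."""
    docs = {cat: Counter(tokenize(" ".join(reviews)))
            for cat, reviews in category_reviews.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in docs.values():
        doc_freq.update(counts.keys())
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1.0
           for term, df in doc_freq.items()}
    return docs, idf


def tfidf(counts, idf):
    """Weight term counts by IDF, dropping terms outside the vocabulary."""
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(tweet, docs, idf):
    """Return the best-scoring category and the score of every category."""
    query = tfidf(Counter(expand_query(tokenize(tweet))), idf)
    scores = {cat: cosine(query, tfidf(counts, idf))
              for cat, counts in docs.items()}
    return max(scores, key=scores.get), scores
```

In this toy setup, a tweet such as "good bk" shares no vocabulary with either pseudo-document until "bk" is expanded to "book", which is the kind of mismatch between short, abbreviated tweets and longer review text that expansion is meant to reduce. Document expansion would work in the opposite direction, enriching the category pseudo-documents rather than the query.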