CL AI IR LG

Jul 7, 2021

Identifying Hijacked Reviews

arXiv:2107.05385v1711 citations

Originality: Incremental advance

AI Analysis

This paper addresses the growing problem of fake reviews in online marketplaces by providing a method to detect one specific manipulation tactic. The advance is incremental: it applies existing models to a new problem.

The paper tackles review hijacking, a manipulation tactic in which sellers alter the product details on an existing page to mislead consumers. It develops a framework for generating synthetic labeled data, evaluates models for detecting hijacked reviews, and identifies hundreds of previously unknown cases in a dataset of 31K products.

Fake reviews and review manipulation are growing problems on online marketplaces globally. Review hijacking is a new review manipulation tactic in which unethical sellers "hijack" an existing product page (usually one with many positive reviews), then update the product details, such as the title, photo, and description, with those of an entirely different product. With the earlier reviews still attached, the new item appears well-reviewed. However, there are no public datasets of review hijacking, and little is known in the literature about this tactic. Hence, this paper proposes a three-part study: (i) we propose a framework to generate synthetically labeled data for review hijacking by swapping products and reviews; (ii) we evaluate the potential of both a Twin LSTM network and a BERT sequence-pair classifier to distinguish legitimate reviews from hijacked ones using this data; and (iii) we deploy the best-performing model on a collection of 31K products (with 6.5M reviews) in the original data, where we find hundreds of previously unknown examples of review hijacking.
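
The abstract describes step (i) only as "swapping products and reviews", so the following is a minimal sketch of one way such synthetic labels could be produced, not the authors' actual procedure. The function name `make_synthetic_pairs`, the input layout, and the choice to pair each review once with its own title and once with a randomly chosen other product's title are all assumptions made for illustration.

```python
import random


def make_synthetic_pairs(products, seed=0):
    """Build (title, review, label) training pairs.

    `products` maps a product id to {"title": str, "reviews": [str, ...]}.
    Label 0: the review is paired with its own product title.
    Label 1: the review is paired with another product's title,
             imitating a hijacked page (hypothetical labeling scheme).
    """
    rng = random.Random(seed)
    ids = sorted(products)
    pairs = []
    for pid in ids:
        own_title = products[pid]["title"]
        others = [other for other in ids if other != pid]
        for review in products[pid]["reviews"]:
            # Legitimate pair: the review stays with its own product.
            pairs.append((own_title, review, 0))
            # Synthetic hijack: same review, a different product's title.
            if others:
                swapped = rng.choice(others)
                pairs.append((products[swapped]["title"], review, 1))
    return pairs


# Toy data, invented for this example only.
products = {
    "A": {"title": "USB cable", "reviews": ["Charges fast", "Sturdy"]},
    "B": {"title": "Yoga mat", "reviews": ["Great grip"]},
}
pairs = make_synthetic_pairs(products)
```

Pairs in this shape could then be fed to a sequence-pair classifier such as the ones named in step (ii). A realistic version would likely need to control which products are swapped (for example, avoiding near-identical items), which the abstract does not specify.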
