LG NE SI | May 25, 2021

HIN-RNN: A Graph Representation Learning Neural Network for Fraudster Group Detection With No Handcrafted Features

Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

arXiv:2105.11602v13.11 citations

Originality: Highly original

AI Analysis

This paper addresses the detection of coordinated fraud in online reviews, a problem that affects both consumers and platforms. Rather than refining an existing method incrementally, it introduces a new neural approach for this specific task.

The paper tackles fraudster group detection in social reviews by proposing HIN-RNN, a neural network that requires no handcrafted features and leverages the semantic relations between reviews. It reports improvements over state-of-the-art approaches of 22% in recall and 12% in F1-value on the Yelp dataset, and of 4% in recall and 2% in F1-value on the Amazon dataset.

Social reviews are indispensable resources for modern consumers' decision making. For financial gain, companies pay fraudsters, preferably in groups, to demote or promote products and services, since consumers are more likely to be misled by a large number of similar reviews from groups. Recent approaches to fraudster group detection employ handcrafted features of group behaviors without considering the semantic relations between reviews written by the reviewers in a group. In this paper, we propose the first neural approach, HIN-RNN, a Heterogeneous Information Network (HIN) compatible RNN for fraudster group detection that requires no handcrafted features. HIN-RNN provides a unifying architecture for representation learning of each reviewer, whose initial vector is the sum of the word embeddings of all review text written by that reviewer, concatenated with the reviewer's ratio of negative reviews. Given the reviewers' vector representations and a co-review network connecting reviewers who have reviewed the same items with the same ratings, a collaboration matrix is acquired through HIN-RNN training. The proposed approach is confirmed to be effective, with marked improvement over state-of-the-art approaches on both the Yelp (22% and 12% in terms of recall and F1-value, respectively) and Amazon (4% and 2% in terms of recall and F1-value, respectively) datasets.
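
The two inputs described in the abstract, the initial reviewer vector and the co-review network, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the whitespace tokenization, the toy embedding lookup, and the rule that a rating of 2 or lower counts as negative are all assumptions made for illustration; the abstract specifies none of them.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np


def reviewer_vector(reviews, embeddings, dim, negative_max=2):
    """Build an initial reviewer vector (hypothetical helper).

    reviews:    list of (text, rating) pairs written by one reviewer
    embeddings: dict mapping a word to a NumPy vector of length `dim`
    negative_max: assumed threshold; ratings <= this count as negative
    """
    total = np.zeros(dim)
    for text, _rating in reviews:
        # Assumption: naive lowercase whitespace tokenization.
        for word in text.lower().split():
            if word in embeddings:
                total += embeddings[word]
    if reviews:
        negative = sum(1 for _text, rating in reviews if rating <= negative_max)
        neg_ratio = negative / len(reviews)
    else:
        neg_ratio = 0.0
    # Sum of word embeddings, concatenated with the negative-review ratio.
    return np.concatenate([total, [neg_ratio]])


def co_review_edges(records):
    """Connect reviewers who gave the same rating to the same item.

    records: iterable of (reviewer, item, rating) triples
    Returns a set of (reviewer_a, reviewer_b) pairs with reviewer_a < reviewer_b.
    """
    groups = defaultdict(set)
    for reviewer, item, rating in records:
        groups[(item, rating)].add(reviewer)
    edges = set()
    for reviewers in groups.values():
        edges.update(combinations(sorted(reviewers), 2))
    return edges


# Toy data, invented purely for this example.
toy_embeddings = {"good": np.array([1.0, 0.0]), "bad": np.array([0.0, 1.0])}
toy_vector = reviewer_vector([("good good", 5), ("bad", 1)], toy_embeddings, dim=2)
toy_edges = co_review_edges([
    ("u1", "i1", 5), ("u2", "i1", 5), ("u3", "i1", 1),
    ("u1", "i2", 1), ("u3", "i2", 1),
])
```

In this toy run, `toy_vector` has three entries: two embedding dimensions plus the negative-review ratio. `toy_edges` links `u1` with `u2` (both rated `i1` with 5) and `u1` with `u3` (both rated `i2` with 1). How HIN-RNN is trained on these inputs to obtain the collaboration matrix is beyond what the abstract describes and is not sketched here.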

