LG · IR · SI · Jul 21, 2022

Modeling User Behavior With Interaction Networks for Spam Detection

Stanford
arXiv:2207.10767v1 · 18 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses spam detection for large digital platforms and offers a pragmatic solution for production systems, though it appears incremental because it builds on existing graph-based methods.

The paper tackles spam detection on web-scale platforms by proposing SEINE, a model that uses interaction networks to capture user behavior, achieving 80% recall at a 1% false positive rate on a real dataset.

Spam is a serious problem plaguing web-scale digital platforms that facilitate user content creation and distribution. It compromises a platform's integrity, the performance of services like recommendation and search, and the overall business. Spammers engage in a variety of abusive and evasive behaviors that are distinct from those of non-spammers. Users' complex behavior can be well represented by a heterogeneous graph rich with node and edge attributes. Learning to identify spammers in such a graph for a web-scale platform is challenging because of its structural complexity and size. In this paper, we propose SEINE (Spam DEtection using Interaction NEtworks), a spam detection model over a novel graph framework. Our graph simultaneously captures rich user details and behavior and enables learning on a billion-scale graph. Our model considers a node's neighborhood along with edge types and attributes, allowing it to capture a wide range of spammers. SEINE, trained on a real dataset of tens of millions of nodes and billions of edges, achieves 80% recall at a 1% false positive rate. SEINE achieves performance comparable to state-of-the-art techniques on a public dataset while being pragmatic enough to be used in a large-scale production system.
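The headline figure, recall at a fixed false positive rate, can be made concrete with a short sketch. The code below is not taken from the paper: the function name `recall_at_fpr` is hypothetical, and the sketch only assumes that a classifier emits a spam score per user and that ground-truth labels are available (1 for spammer, 0 for non-spammer). It sweeps the decision threshold from strict to lenient and reports the best recall reachable while the false positive rate stays within the budget.

```python
def recall_at_fpr(scores, labels, max_fpr=0.01):
    """Return (recall, threshold) at the most lenient threshold whose
    false positive rate does not exceed max_fpr.

    A user is flagged as spam when score >= threshold.
    labels: 1 = spammer, 0 = non-spammer (assumed encoding).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one spammer and one non-spammer")

    # Walk users from highest to lowest score, lowering the threshold as we go.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    best_recall, best_threshold = 0.0, float("inf")

    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Users with tied scores are flagged together at this threshold.
        j = i
        while j < len(ranked) and ranked[j][0] == threshold:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / negatives > max_fpr:
            break  # lowering the threshold further only adds false positives
        best_recall, best_threshold = tp / positives, threshold
        i = j
    return best_recall, best_threshold
```

Under this definition, "80% recall with 1% false positive rate" would mean that a threshold exists at which 80% of true spammers are flagged while at most 1% of legitimate users are flagged by mistake.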

Code Implementations: 1 repo
