Graph-Hist: Graph Classification from Latent Feature Histograms With Application to Bot Detection
This paper addresses bot detection in social media; it is an incremental improvement over existing methods for a domain-specific problem.
The paper tackled graph classification for social media networks, which are large and sparse, unlike typical benchmarks, by proposing Graph-Hist, an end-to-end architecture that uses latent feature histograms; it improved state-of-the-art performance on social media benchmarks and outperformed existing bot-detection models at detecting bots from their conversational graphs.
Neural networks are increasingly used for graph classification in a variety of contexts. Social media is a critical application area in this space; however, the characteristics of social media graphs differ from those seen in most popular benchmark datasets. Social networks tend to be large and sparse, while benchmarks are small and dense. Classically, large and sparse networks are analyzed by studying the distribution of local properties. Inspired by this, we introduce Graph-Hist: an end-to-end architecture that extracts a graph's latent local features, bins nodes together along 1-D cross sections of the feature space, and classifies the graph based on this multi-channel histogram. We show that Graph-Hist improves state-of-the-art performance on true social media benchmark datasets, while still performing well on other benchmarks. Finally, we demonstrate Graph-Hist's performance by conducting bot detection in social media. While sophisticated bot and cyborg accounts increasingly evade traditional detection methods, they leave artificial artifacts in their conversational graphs that can be detected through graph classification. We apply Graph-Hist to classify these conversational graphs. In the process, we confirm that social media graphs are different from most baselines and that Graph-Hist outperforms existing bot-detection models.
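The binning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: in Graph-Hist the node features are learned and the architecture is trained end to end, whereas this sketch assumes the per-node features are already given, lie roughly in a fixed range, and are counted into fixed, equal-width bins. The function name `feature_histogram` and its parameters are hypothetical.

```python
def feature_histogram(node_features, n_bins=8, lo=0.0, hi=1.0):
    """Return one normalized histogram per feature channel.

    node_features: list of per-node feature vectors, all of the same length.
    Result: list of n_channels lists, each of length n_bins and summing to 1.

    Assumption: fixed equal-width bins over [lo, hi]; values outside the
    range are clipped into the first or last bin.
    """
    n_nodes = len(node_features)
    n_channels = len(node_features[0])
    width = (hi - lo) / n_bins
    hist = [[0.0] * n_bins for _ in range(n_channels)]
    for features in node_features:
        for c, value in enumerate(features):
            clipped = min(max(value, lo), hi)
            # The top edge of the range belongs to the last bin.
            b = min(int((clipped - lo) / width), n_bins - 1)
            hist[c][b] += 1.0 / n_nodes
    return hist


# Graphs with very different node counts map to outputs of the same shape.
small_graph = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.95]]
large_graph = [[(i % 10) / 10, (i % 7) / 7] for i in range(1000)]
h_small = feature_histogram(small_graph, n_bins=4)
h_large = feature_histogram(large_graph, n_bins=4)
```

Because each channel is binned independently along one feature dimension and normalized by the node count, the output size depends only on the number of channels and bins, not on the size of the graph. This is what allows a fixed-size classifier to consume graphs of any size.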