CR
Mar 19, 2019

BotGraph: Web Bot Detection Based on Sitemap

arXiv:1903.08074v2
7 citations
AI Analysis

This paper addresses bot traffic consumption and site scraping, two problems faced by web service providers. It represents a domain-specific incremental improvement.

The paper tackles the problem of detecting web bots that forge their identities to bypass traditional signature-based detection. It proposes BotGraph, a behavior-based scheme that combines a sitemap with a convolutional neural network (CNN), and reports ~95% recall and precision on 35-day production data traces.

Web bots have been blamed for years for consuming a large amount of Internet traffic and undermining the interests of the sites they scrape. Traditional bot detection studies focus mainly on signature-based solutions, but advanced bots usually forge their identities to bypass such detection. With increasing cloud migration, cloud providers have new opportunities to address this issue through effective bot detection based on big data. In this paper, we present a behavior-based bot detection scheme called BotGraph that combines a sitemap with a convolutional neural network (CNN) to detect the inner behavior of bots. Experimental results show that BotGraph achieves ~95% recall and precision on 35-day production data traces from different customers, including the Bing search engine and several sites.
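
The abstract reports its headline result as recall and precision. For readers less familiar with these metrics, the following is a minimal sketch of how they are computed for a binary bot/human classifier. The function name `precision_recall` and the session labels are hypothetical illustrations; they are not taken from the paper or its data.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive class (1 = bot, 0 = human)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bots flagged as bots
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # humans flagged as bots
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bots that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth and predictions for eight sessions (made-up data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the fraction of sessions flagged as bots that really are bots. Recall is the fraction of real bot sessions that were flagged. A result near 95% on both means few human sessions are misclassified and few bots escape detection.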

