CY · CR · SI · Nov 4, 2018

Structure and Content of the Visible Darknet

arXiv:1811.01348v2 · 18 citations
Originality: Synthesis-oriented
AI Analysis

This study provides empirical insights into the structure and content of the darknet, addressing a gap in understanding for cybersecurity and law enforcement communities, though it is incremental in applying existing methods to this domain.

The authors analyzed the topology and content of the Tor-accessible darknet by crawling more than 34,000 hidden services, of which 10,000 were online, and found that its visible part is well connected through hubs such as wikis and forums. Using supervised machine learning, they categorized the content and observed that about half appears licit; the unlawful services mostly involve fraud, counterfeit goods, and drug markets.

In this paper, we analyze the topology and the content found on the "darknet", the set of websites accessible via Tor. We created a darknet spider and crawled the darknet starting from a bootstrap list by recursively following links. We explored the whole connected component of more than 34,000 hidden services, of which we found 10,000 to be online. Contrary to folklore belief, the visible part of the darknet is surprisingly well-connected through hub websites such as wikis and forums. We performed a comprehensive categorization of the content using supervised machine learning. We observe that about half of the visible dark web content is related to apparently licit activities based on our classifier. A significant amount of content pertains to software repositories, blogs, and activism-related websites. Among unlawful hidden services, most pertain to fraudulent websites, services selling counterfeit goods, and drug markets.
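The crawl described in the abstract (start from a bootstrap list, follow links recursively, record which services answer) can be illustrated with a breadth-first traversal. The sketch below is not the authors' spider: it runs over a small in-memory link graph instead of Tor, and the names `crawl`, `TOY_WEB`, and the `.onion` addresses are hypothetical. A site missing from the graph stands in for a hidden service that is offline, and the in-link count is one simple way to spot hub sites.

```python
from collections import deque

# Hypothetical toy link graph: each key is a reachable ("online") site and
# maps to the addresses it links to. Addresses that never appear as a key
# are treated as offline. This stands in for real page fetches over Tor.
TOY_WEB = {
    "wiki.onion": ["forum.onion", "market.onion", "blog.onion"],
    "forum.onion": ["wiki.onion", "market.onion", "gone.onion"],
    "market.onion": ["wiki.onion"],
    "blog.onion": [],
    "isolated.onion": ["wiki.onion"],  # links in, but nothing links to it
}


def crawl(link_graph, seeds):
    """Breadth-first crawl from the seed addresses.

    Returns (seen, online, in_degree):
      seen      - every address discovered, online or not
      online    - the subset of seen that responded (has an entry in the graph)
      in_degree - number of links pointing at each address, from crawled pages
    """
    seen = set(seeds)
    queue = deque(dict.fromkeys(seeds))  # de-duplicate seeds, keep their order
    online = set()
    in_degree = {}

    while queue:
        site = queue.popleft()
        links = link_graph.get(site)
        if links is None:
            continue  # discovered but offline: nothing to follow
        online.add(site)
        for target in links:
            if target != site:  # ignore self-links when counting in-links
                in_degree[target] = in_degree.get(target, 0) + 1
            if target not in seen:
                seen.add(target)
                queue.append(target)

    return seen, online, in_degree


if __name__ == "__main__":
    seen, online, in_degree = crawl(TOY_WEB, ["wiki.onion"])
    print(len(seen), "discovered,", len(online), "online")
    # Sites with the most in-links are candidate hubs.
    for site, count in sorted(in_degree.items(), key=lambda kv: -kv[1]):
        print(site, count)
```

In this toy graph, `isolated.onion` is never discovered because no crawled page links to it. That illustrates why such a crawl only covers the part of the darknet connected to its starting points, which is the "visible" part the title refers to.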

