Perceptual Hashing applied to Tor domains recognition
This work addresses cybersecurity agencies' need for efficient and accurate automated monitoring of illegal content on the Tor darknet, and it represents an incremental improvement over existing methods.
The paper tackles the problem of automatically classifying Tor darknet domains by their screenshots in order to monitor illegal content. It introduces a new perceptual hashing method, F-DNS, that achieves 98.75% accuracy on a dataset of Tor images, outperforming the other methods it was compared against.
The Tor darknet hosts different types of illegal content, which cybersecurity agencies monitor. However, manually classifying Tor content can be slow and error-prone. To support this task, we introduce Frequency-Dominant Neighborhood Structure (F-DNS), a new perceptual hashing method for automatically classifying domains by their screenshots. First, we evaluated F-DNS on images subjected to various content-preserving operations. We compared the modified images with their originals, and F-DNS achieved better correlation coefficients than other state-of-the-art methods, especially under rotation. Then, we applied F-DNS to categorize Tor domains using the Darknet Usage Service Images-2K (DUSI-2K), a dataset of screenshots of active Tor service domains. Finally, we measured the performance of F-DNS against an image classification approach and a state-of-the-art hashing method. Our proposal obtained 98.75% accuracy on Tor images, surpassing all other methods compared.
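The robustness evaluation described above (comparing an image with a content-preserved copy through a correlation coefficient) can be illustrated with a minimal sketch. This is not F-DNS, whose construction is not given in this section: the block-mean feature vector below is a simple stand-in, and every name (`block_mean_features`, `pearson`, `brighten`) as well as the synthetic images are assumptions made for illustration only.

```python
import random
from statistics import mean


def block_mean_features(img, blocks=8):
    """Stand-in perceptual feature vector (NOT F-DNS): the mean intensity
    of each cell in a blocks x blocks grid over a grayscale image."""
    h, w = len(img), len(img[0])
    bh, bw = h // blocks, w // blocks
    feats = []
    for by in range(blocks):
        for bx in range(blocks):
            cell = [img[y][x]
                    for y in range(by * bh, (by + 1) * bh)
                    for x in range(bx * bw, (bx + 1) * bw)]
            feats.append(mean(cell))
    return feats


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def brighten(img, delta):
    """A simple content-preserving operation: uniform brightness shift."""
    return [[min(255, p + delta) for p in row] for row in img]


# Synthetic 64x64 grayscale images (hypothetical data, fixed seed).
rng = random.Random(0)
original = [[rng.randrange(200) for _ in range(64)] for _ in range(64)]
other = [[rng.randrange(200) for _ in range(64)] for _ in range(64)]
modified = brighten(original, 30)

# Correlation between the original and its modified copy, and between
# the original and an unrelated image.
same = pearson(block_mean_features(original), block_mean_features(modified))
diff = pearson(block_mean_features(original), block_mean_features(other))
print(f"original vs. modified:  {same:.4f}")
print(f"original vs. unrelated: {diff:.4f}")
```

A robust hash should give a correlation close to 1 for the modified copy and a clearly lower value for the unrelated image. This stand-in has no rotation invariance, which is the case the abstract highlights as a strength of F-DNS.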