k-fingerprinting: a Robust Scalable Website Fingerprinting Technique
This paper addresses privacy breaches affecting users of encrypted or anonymized networks by demonstrating a scalable and robust website fingerprinting attack.
The paper tackles website fingerprinting by introducing k-fingerprinting, a technique based on random decision forests. It achieves an 85% true positive rate and a false positive rate as low as 0.02% when identifying which of 30 monitored hidden services a client is visiting, against a world of 100,000 unmonitored pages.
Website fingerprinting enables an attacker to infer which web page a client is browsing through encrypted or anonymized network connections. We present a new website fingerprinting technique based on random decision forests and evaluate performance over standard web pages as well as Tor hidden services, on a larger scale than previous works. Our technique, k-fingerprinting, performs better than current state-of-the-art attacks even against website fingerprinting defenses, and we show that it is possible to launch a website fingerprinting attack in the face of a large amount of noisy data. We can correctly determine which of 30 monitored hidden services a client is visiting with 85% true positive rate (TPR), a false positive rate (FPR) as low as 0.02%, from a world size of 100,000 unmonitored web pages. We further show that error rates vary widely between web resources, and thus some patterns of use will be predictably more vulnerable to attack than others.
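To put the reported rates in concrete terms, the following minimal sketch computes TPR and FPR from confusion counts. It assumes, purely for illustration, that each of the 100,000 unmonitored pages is tested once and that there are 2,000 test visits to monitored hidden services; neither figure is stated in the abstract, and the `rates` helper is hypothetical. Under those assumptions, an FPR of 0.02% corresponds to roughly 20 unmonitored pages wrongly flagged as monitored.

```python
def rates(tp, fn, fp, tn):
    """Return (TPR, FPR) from open-world confusion counts."""
    tpr = tp / (tp + fn)  # share of monitored visits correctly detected
    fpr = fp / (fp + tn)  # share of unmonitored visits wrongly flagged
    return tpr, fpr


# World size taken from the abstract; one test visit per page is an assumption.
UNMONITORED_VISITS = 100_000
# Assumed number of monitored test visits (not given in the abstract).
MONITORED_VISITS = 2_000

# Counts chosen to match the reported rates: 85% TPR, 0.02% FPR.
tp = 1_700                     # 85% of 2,000 monitored visits
fn = MONITORED_VISITS - tp
fp = 20                        # 0.02% of 100,000 unmonitored visits
tn = UNMONITORED_VISITS - fp

tpr, fpr = rates(tp, fn, fp, tn)

# Precision depends on the assumed ratio of monitored to unmonitored visits.
precision = tp / (tp + fp)

print(f"TPR={tpr:.2%}  FPR={fpr:.2%}  precision={precision:.3f}")
```

The precision figure is only as meaningful as the assumed mix of monitored and unmonitored traffic: with fewer monitored visits relative to the unmonitored world, the same FPR yields a lower precision.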