13.4CRApr 9
Tracing the Chain: Deep Learning for Stepping-Stone Intrusion DetectionNate Mathews, Nicholas Hopper, Matthew Wright
Stepping-stone intrusions (SSIs) are a prevalent network evasion technique in which attackers route sessions through chains of compromised intermediate hosts to obscure their origin. Effective SSI detection requires correlating the incoming and outgoing flows at each relay host at extremely low false positive rates -- a stringent requirement that renders classical statistical methods inadequate in operational settings. We apply ESPRESSO, a deep learning flow correlation model combining a transformer-based feature extraction network, time-aligned multi-channel interval features, and online triplet metric learning, to the problem of stepping-stone intrusion detection. To support training and evaluation, we develop a synthetic data collection tool that generates realistic stepping-stone traffic across five tunneling protocols: SSH, SOCAT, ICMP, DNS, and mixed multi-protocol chains. Across all five protocols and in both host-mode and network-mode detection scenarios, ESPRESSO substantially outperforms the state-of-the-art DeepCoFFEA baseline, achieving a true positive rate exceeding 0.99 at a false positive rate of $10^{-3}$ for standard bursty protocols in network-mode. We further demonstrate chain length prediction as a tool for distinguishing malicious from benign pivoting, and conduct a systematic robustness analysis revealing that timing-based perturbations are the primary vulnerability of correlation-based stepping-stone detectors.
CRFeb 18, 2019
Mockingbird: Defending Against Deep-Learning-Based Website Fingerprinting Attacks with Adversarial TracesMohammad Saidur Rahman, Mohsen Imani, Nate Mathews et al.
Website Fingerprinting (WF) is a type of traffic analysis attack that enables a local passive eavesdropper to infer the victim's activity, even when the traffic is protected by a VPN or an anonymity system like Tor. Leveraging a deep-learning classifier, a WF attacker can gain over 98% accuracy on Tor traffic. In this paper, we explore a novel defense, Mockingbird, based on the idea of adversarial examples that have been shown to undermine machine-learning classifiers in other domains. Since the attacker gets to design and train his attack classifier based on the defense, we first demonstrate that at a straightforward technique for generating adversarial-example based traces fails to protect against an attacker using adversarial training for robust classification. We then propose Mockingbird, a technique for generating traces that resists adversarial training by moving randomly in the space of viable traces and not following more predictable gradients. The technique drops the accuracy of the state-of-the-art attack hardened with adversarial training from 98% to 42-58% while incurring only 58% bandwidth overhead. The attack accuracy is generally lower than state-of-the-art defenses, and much lower when considering Top-2 accuracy, while incurring lower bandwidth overheads.
CRFeb 18, 2019
Tik-Tok: The Utility of Packet Timing in Website Fingerprinting AttacksMohammad Saidur Rahman, Payap Sirinam, Nate Mathews et al.
A passive local eavesdropper can leverage Website Fingerprinting (WF) to deanonymize the web browsing activity of Tor users. The value of timing information to WF has often been discounted in recent works due to the volatility of low-level timing information. In this paper, we more carefully examine the extent to which packet timing can be used to facilitate WF attacks. We first propose a new set of timing-related features based on burst-level characteristics to further identify more ways that timing patterns could be used by classifiers to identify sites. Then we evaluate the effectiveness of both raw timing and directional timing which is a combination of raw timing and direction in a deep-learning-based WF attack. Our closed-world evaluation shows that directional timing performs best in most of the settings we explored, achieving: (i) 98.4% in undefended Tor traffic; (ii) 93.5% on WTF-PAD traffic, several points higher than when only directional information is used; and (iii) 64.7% against onion sites, 12% higher than using only direction. Further evaluations in the open-world setting show small increases in both precision (+2%) and recall (+6%) with directional-timing on WTF-PAD traffic. To further investigate the value of timing information, we perform an information leakage analysis on our proposed handcrafted features. Our results show that while timing features leak less information than directional features, the information contained in each feature is mutually exclusive to one another and can thus improve the robustness of a classifier.