Classification of Encrypted IoT Traffic Despite Padding and Shaping
This work is relevant to both network defenders and adversaries: it shows that current traffic obfuscation methods (padding and shaping) are insufficient to hide IoT device activity, although the contribution is incremental in that it builds on known fingerprinting techniques.
The paper demonstrates that encrypted IoT traffic can still be fingerprinted to identify active devices even when padding and shaping defenses are used, achieving at least 96% recall and precision for device identification and 81% accuracy for detecting real activity in 1-second windows.
It is well known that when IoT traffic is unencrypted, the active devices can be identified from their TCP/IP headers, and that when traffic is encrypted, packet sizes and timings can still be used to do so. Traffic padding and shaping were introduced to defend against such fingerprinting. In this paper we demonstrate that the packet-size distribution can still be used to successfully fingerprint the active IoT devices when shaping and padding are used, as long as the adversary is aware that these mitigations are deployed, and even if the values of the padding and shaping parameters are unknown. The main tool in our analysis is the full distribution of packet sizes, as opposed to commonly used summary statistics such as the mean and variance. We further show that an external adversary who sees only the padded and shaped traffic, aggregated and hidden behind a NAT middlebox, can accurately identify the subset of active devices with recall and precision of at least 96%. We also show that the adversary can distinguish time windows containing only bogus cover packets from windows with real device activity, at a granularity of 1-second windows, with 81% accuracy. Using a similar methodology, but now on the defender's side, we are also able to detect anomalous activity in IoT traffic caused by the Mirai worm.
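To illustrate why a full packet-size distribution can carry more information than its mean and variance, the following minimal sketch compares two toy traces that share the same mean and variance but have different histograms. It is not the paper's method: the bin width, the total variation distance, the nearest-reference rule, the device names, and all packet sizes are assumptions made up for this illustration.

```python
from collections import Counter
from statistics import mean, pvariance


def size_histogram(sizes, bin_width=50):
    """Normalized histogram: bin start (bytes) -> fraction of packets in that bin."""
    counts = Counter((s // bin_width) * bin_width for s in sizes)
    total = len(sizes)
    return {b: c / total for b, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two normalized histograms (0 = identical)."""
    bins = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in bins)


def closest_device(observed_sizes, references, bin_width=50):
    """Return the name of the reference histogram nearest to the observed trace."""
    obs = size_histogram(observed_sizes, bin_width)
    return min(references, key=lambda name: total_variation(obs, references[name]))


# Hypothetical toy traces (packet sizes in bytes), chosen so that both
# have mean 200 and population variance 10000.
device_a = [100, 300] * 9
device_b = [50] * 4 + [200] * 10 + [350] * 4

references = {
    "device_a": size_histogram(device_a),
    "device_b": size_histogram(device_b),
}

# A short observed trace drawn from the same mix of sizes as device_b.
observed = [200, 50, 200, 350, 200, 200, 50, 200, 350]
guess = closest_device(observed, references)

print("means:", mean(device_a), mean(device_b))
print("variances:", pvariance(device_a), pvariance(device_b))
print("closest reference:", guess)
```

In this toy setting the two summary statistics cannot separate the traces, while the histograms do not overlap at all. Real padded and shaped traffic would be far noisier, so the sketch only conveys the intuition behind using the whole distribution.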