Early Identification of Services in HTTPS Traffic
This work addresses network management challenges for security and QoS by enabling early identification of services in encrypted HTTPS traffic. The contribution is incremental, building on existing identification methods that do not require decryption.
The paper tackles the problem of identifying HTTPS services without decryption. It proposes a machine learning method that uses statistical features computed from TLS handshake packets and early application data packets. Experiments on an open dataset confirm that the method identifies services early in the session with good accuracy.
Traffic monitoring is essential for network management tasks that ensure security and QoS. However, the continuous increase of HTTPS traffic undermines the effectiveness of current service-level monitoring, which must either rely on unreliable parameters from the TLS handshake (X.509 certificate, SNI) or decrypt the traffic. We propose a new machine learning-based method to identify HTTPS services without decryption. By extracting statistical features from the TLS handshake packets and from a small number of application data packets, we can identify HTTPS services very early in the session. Extensive experiments performed on a significant, open dataset show that our method achieves good accuracy, and a prototype implementation confirms that HTTPS services can be identified early in the session.
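
The feature extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the actual features, so the `Packet` record, the `early_features` function, the choice of min/max/mean/standard deviation, and the `max_app_packets` cutoff are all assumptions made for illustration. The sketch only uses information visible without decryption (packet sizes, timestamps, and whether a packet belongs to the handshake).

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Sequence


@dataclass
class Packet:
    """Hypothetical record of one observed packet (no decrypted content)."""
    timestamp: float     # seconds since the start of the flow
    size: int            # payload length in bytes
    is_handshake: bool   # True for TLS handshake, False for application data


def summarize(values: Sequence[float]) -> List[float]:
    """Return [min, max, mean, population std dev], or zeros if empty."""
    if not values:
        return [0.0, 0.0, 0.0, 0.0]
    return [float(min(values)), float(max(values)),
            float(mean(values)), float(pstdev(values))]


def early_features(packets: Sequence[Packet],
                   max_app_packets: int = 5) -> List[float]:
    """Build a 12-value feature vector from the start of a session.

    Assumed layout: size statistics of handshake packets, size statistics
    of the first `max_app_packets` application data packets, and
    inter-arrival time statistics over the packets that were kept.
    """
    handshake = [p for p in packets if p.is_handshake]
    app = [p for p in packets if not p.is_handshake][:max_app_packets]
    kept = sorted(handshake + app, key=lambda p: p.timestamp)
    gaps = [b.timestamp - a.timestamp for a, b in zip(kept, kept[1:])]
    return (summarize([p.size for p in handshake])
            + summarize([p.size for p in app])
            + summarize(gaps))
```

A vector like this would then be passed to a supervised classifier trained on labeled flows. Keeping `max_app_packets` small is what makes the identification early: the decision can be taken after only a few application data packets rather than at the end of the session.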