CRSep 2, 2022
Tweaking Metasploit to Evade Encrypted C2 Traffic DetectionGonçalo Xavier, Carlos Novo, Ricardo Morla
Command and Control (C2) communication is a key component of any structured cyber-attack. As such, security operations actively try to detect this type of communication in their networks. This poses a problem for legitimate pentesters that try to remain undetected, since commonly used pentesting tools, such as Metasploit, generate constant traffic patterns that are easily distinguishable from regular web traffic. In this paper we start with these identifiable patterns in Metasploit's C2 traffic and show that a machine learning-based detector is able to detect the presence of such traffic with high accuracy, even when encrypted. We then outline and implement a set of modifications to the Metasploit framework in order to decrease the detection rates of such classifier. To evaluate the performance of these modifications, we use two threat models with increasing awareness of these modifications. We look at the detection evasion performance and at the byte count and runtime overhead of the modifications. Our results show that for the second, increased-awareness threat model the framework-side traffic modifications yield a better detection avoidance rate (90%) than payload-side only modifications (50%). We also show that although the modifications use up to 3 times more TLS payload bytes than the original, the runtime does not significantly change and the total number of bytes (including TLS payload) reduces.
CRSep 2, 2020
Flow-based detection and proxy-based evasion of encrypted malware C2 trafficCarlos Novo, Ricardo Morla
State of the art deep learning techniques are known to be vulnerable to evasion attacks where an adversarial sample is generated from a malign sample and misclassified as benign. Detection of encrypted malware command and control traffic based on TCP/IP flow features can be framed as a learning task and is thus vulnerable to evasion attacks. However, unlike e.g. in image processing where generated adversarial samples can be directly mapped to images, going from flow features to actual TCP/IP packets requires crafting the sequence of packets, with no established approach for such crafting and a limitation on the set of modifiable features that such crafting allows. In this paper we discuss learning and evasion consequences of the gap between generated and crafted adversarial samples. We exemplify with a deep neural network detector trained on a public C2 traffic dataset, white-box adversarial learning, and a proxy-based approach for crafting longer flows. Our results show 1) the high evasion rate obtained by using generated adversarial samples on the detector can be significantly reduced when using crafted adversarial samples; 2) robustness against adversarial samples by model hardening varies according to the crafting approach and corresponding set of modifiable features that the attack allows for; 3) incrementally training hardened models with adversarial samples can produce a level playing field where no detector is best against all attacks and no attack is best against all detectors, in a given set of attacks and detectors. To the best of our knowledge this is the first time that level playing field feature set- and iteration-hardening are analyzed in encrypted C2 malware traffic detection.
CRDec 14, 2019
Ten AI Stepping Stones for CybersecurityRicardo Morla
With the turmoil in cybersecurity and the mind-blowing advances in AI, it is only natural that cybersecurity practitioners consider further employing learning techniques to help secure their organizations and improve the efficiency of their security operation centers. But with great fears come great opportunities for both the good and the evil, and a myriad of bad deals. This paper discusses ten issues in cybersecurity that hopefully will make it easier for practitioners to ask detailed questions about what they want from an AI system in their cybersecurity operations. We draw on the state of the art to provide factual arguments for a discussion on well-established AI in cybersecurity issues, including the current scope of AI and its application to cybersecurity, the impact of privacy concerns on the cybersecurity data that can be collected and shared externally to the organization, how an AI decision can be explained to the person running the operations center, and the implications of the adversarial nature of cybersecurity in the learning techniques. We then discuss the use of AI by attackers on a level playing field including several issues in an AI battlefield, and an AI perspective on the old cat-and-mouse game including how the adversary may assess your AI power.
NIJul 4, 2017
Anomaly Detection and Modeling in 802.11 Wireless NetworksAnisa Allahdadi, Ricardo Morla
IEEE 802.11 Wireless Networks are getting more and more popular at university campuses, enterprises, shopping centers, airports and in so many other public places, providing Internet access to a large crowd openly and quickly. The wireless users are also getting more dependent on WiFi technology and therefore demanding more reliability and higher performance for this vital technology. However, due to unstable radio conditions, faulty equipment, and dynamic user behavior among other reasons, there are always unpredictable performance problems in a wireless covered area. Detection and prediction of such problems is of great significance to network managers if they are to alleviate the connectivity issues of the mobile users and provide a higher quality wireless service. This paper aims to improve the management of the 802.11 wireless networks by characterizing and modeling wireless usage patterns in a set of anomalous scenarios that can occur in such networks. We apply time-invariant (Gaussian Mixture Models) and time-variant (Hidden Markov Models) modeling approaches to a dataset generated from a large production network and describe how we use these models for anomaly detection. We then generate several common anomalies on a Testbed network and evaluate the proposed anomaly detection methodologies in a controlled environment. The experimental results of the Testbed show that HMM outperforms GMM and yields a higher anomaly detection ratio and a lower false alarm rate.
CRJul 3, 2017
Effect of Pipelining and Multiplexing in Estimating HTTP/2.0 Web Object SizesRicardo Morla
HTTP response size is a well-known side channel attack. With the deployment of HTTP/2.0, response size estimation attacks are generally dismissed with the argument that pipelining and response multiplexing prevent eavesdroppers from finding out response sizes. Yet the impact that pipelining and response multiplexing actually have in estimating HTTP response sizes has not been adequately investigated. In this paper we set out to help understand the effect of pipelining and response multiplexing in estimating the size of web objects on the Internet. We conduct an experiment that collects HTTP response sizes and TLS record sizes from 10k popular web sites. We gather evidence on and discuss reasons for the limited amount of pipelining and response multiplexing used on the Internet today: only 29% of the HTTP2 web objects we observe are pipelined and only 5% multiplexed. We also provide worst case results under different attack assumptions and show how effective a simple model for estimating response sizes from TLS record sizes can be. Our conclusion is that pipelining and especially response multiplexing can yield, as expected, a perceivable increase in relative object size estimation error yet the limited extent of multiplexing observed on the Internet today and the relative simplicity of attacks to the current pipelining mechanisms hinder their ability to help prevent web object size estimation.
CRJul 22, 2016
An initial study of the effect of pipelining in hiding HTTP/2.0 response sizesRicardo Morla
HTTP response size is a well-known side channel attack. With the deployment of HTTP/2.0, response size attacks are generally dismissed with the argument that pipelining and response multiplexing prevent eavesdroppers from finding out response sizes. Yet the extent to which pipelining and response multiplexing actually hide HTTP response sizes has not been adequately investigated. In this paper we set out to help understand the effect of pipelining in hiding the size of web objects on the Internet. We conduct an experiment that provides browser-side HTTP response sizes and network-captured TLS record sizes and show how the model that we propose for estimating response sizes from TLS record sizes improves response matching and attack performance. In this process we gather evidence on how different implementations of HTTP/2.0 web servers generate different side- channel information and the limited amount of pipelining and response multiplexing used on the Internet today.
CVSep 29, 2015
Long-Range Trajectories from Global and Local Motion RepresentationsEduardo M. Pereira, Jaime S. Cardoso, Ricardo Morla
Motion is a fundamental cue for scene analysis and human activity understan- ding in videos. It can be encoded in trajectories for tracking objects and for action recognition, or in form of flow to address behaviour analysis in crowded scenes. Each approach can only be applied on limited scenarios. We propose a motion-based system that represents the spatial and temporal features of the flow in terms of long-range trajectories. The novelty resides on the system formulation, its generic approach to handle scene variability and motion variations, motion integration from local and global representations, and the resulting long-range trajectories that overcome trajectory-based approach problems. We report the results and conclusions that state its pertinence on different scenarios, comparing and correlating the extracted trajectories of individual pedestrians, manually annotated. We also propose an evaluation framework and stress the diverse system characteristics that can be used for human activity tasks, namely on motion segmentation.
AIJun 12, 2014
Event and Anomaly Detection Using Tucker3 DecompositionHadi Fanaee-T, Márcia D. B. Oliveira, João Gama et al.
Failure detection in telecommunication networks is a vital task. So far, several supervised and unsupervised solutions have been provided for discovering failures in such networks. Among them unsupervised approaches has attracted more attention since no label data is required. Often, network devices are not able to provide information about the type of failure. In such cases the type of failure is not known in advance and the unsupervised setting is more appropriate for diagnosis. Among unsupervised approaches, Principal Component Analysis (PCA) is a well-known solution which has been widely used in the anomaly detection literature and can be applied to matrix data (e.g. Users-Features). However, one of the important properties of network data is their temporal sequential nature. So considering the interaction of dimensions over a third dimension, such as time, may provide us better insights into the nature of network failures. In this paper we demonstrate the power of three-way analysis to detect events and anomalies in time-evolving network data.