Julio Hernandez-Castro

CR
7papers
90citations
Novelty36%
AI Score21

7 Papers

LGSep 9, 2022
Explanation Method for Anomaly Detection on Mixed Numerical and Categorical Spaces

Iñigo López-Riobóo Botana, Carlos Eiras-Franco, Julio Hernandez-Castro et al.

Most proposals in the anomaly detection field focus exclusively on the detection stage, specially in the recent deep learning approaches. While providing highly accurate predictions, these models often lack transparency, acting as "black boxes". This criticism has grown to the point that explanation is now considered very relevant in terms of acceptability and reliability. In this paper, we addressed this issue by inspecting the ADMNC (Anomaly Detection on Mixed Numerical and Categorical Spaces) model, an existing very accurate although opaque anomaly detector capable to operate with both numerical and categorical inputs. This work presents the extension EADMNC (Explainable Anomaly Detection on Mixed Numerical and Categorical spaces), which adds explainability to the predictions obtained with the original model. We preserved the scalability of the original method thanks to the Apache Spark framework. EADMNC leverages the formulation of the previous ADMNC model to offer pre hoc and post hoc explainability, while maintaining the accuracy of the original architecture. We present a pre hoc model that globally explains the outputs by segmenting input data into homogeneous groups, described with only a few variables. We designed a graphical representation based on regression trees, which supervisors can inspect to understand the differences between normal and anomalous data. Our post hoc explanations consist of a text-based template method that locally provides textual arguments supporting each detection. We report experimental results on extensive real-world data, particularly in the domain of network intrusion detection. The usefulness of the explanations is assessed by theory analysis using expert knowledge in the network intrusion domain.

CRMay 12, 2021
Responding to Living-Off-the-Land Tactics using Just-in-Time Memory Forensics (JIT-MF) for Android

Jennifer Bellizzi, Mark Vella, Christian Colombo et al.

Digital investigations of stealthy attacks on Android devices pose particular challenges to incident responders. Whereas consequential late detection demands accurate and comprehensive forensic timelines to reconstruct all malicious activities, reduced forensic footprints with minimal malware involvement, such as when Living-Off-the-Land (LOtL) tactics are adopted, leave investigators little evidence to work with. Volatile memory forensics can be an effective approach since app execution of any form is always bound to leave a trail of evidence in memory, even if perhaps ephemeral. Just-in-Time Memory Forensics (JIT-MF) is a recently proposed technique that describes a framework to process memory forensics on existing stock Android devices, without compromising their security by requiring them to be rooted. Within this framework, JIT-MF drivers are designed to promptly dump in-memory evidence related to app usage or misuse. In this work, we primarily introduce a conceptualized presentation of JIT-MF drivers. Subsequently, through a series of case studies involving the hijacking of widely-used messaging apps, we show that when the target apps are forensically enhanced with JIT-MF drivers, investigators can generate richer forensic timelines to support their investigation, which are on average 26% closer to ground truth.

CRAug 6, 2020
Intercepting Hail Hydra: Real-Time Detection of Algorithmically Generated Domains

Fran Casino, Nikolaos Lykousas, Ivan Homoliak et al.

A crucial technical challenge for cybercriminals is to keep control over the potentially millions of infected devices that build up their botnets, without compromising the robustness of their attacks. A single, fixed C&C server, for example, can be trivially detected either by binary or traffic analysis and immediately sink-holed or taken-down by security researchers or law enforcement. Botnets often use Domain Generation Algorithms (DGAs), primarily to evade take-down attempts. DGAs can enlarge the lifespan of a malware campaign, thus potentially enhancing its profitability. They can also contribute to hindering attack accountability. In this work, we introduce HYDRAS, the most comprehensive and representative dataset of Algorithmically-Generated Domains (AGD) available to date. The dataset contains more than 100 DGA families, including both real-world and adversarially designed ones. We analyse the dataset and discuss the possibility of differentiating between benign requests (to real domains) and malicious ones (to AGDs) in real-time. The simultaneous study of so many families and variants introduces several challenges; nonetheless, it alleviates biases found in previous literature employing small datasets which are frequently overfitted, exploiting characteristic features of particular families that do not generalise well.We thoroughly compare our approach with the current state-of-the-art and highlight some methodological shortcomings in the actual state of practice. The outcomes obtained show that our proposed approach significantly outperforms the current state-of-the-art in terms of both classification performance and efficiency.

CRMar 20, 2017
Economic Analysis of Ransomware

Julio Hernandez-Castro, Edward Cartwright, Anna Stepanova

We present in this work an economic analysis of ransomware, with relevant data from Cryptolocker, CryptoWall, TeslaCrypt and other major strands. We include a detailed study of the impact that different price discrimination strategies can have on the success of a ransomware family, examining uniform pricing, optimal price discrimination and bargaining strategies and analysing their advantages and limitations. In addition, we present results of a preliminary survey that can helps in estimating an optimal ransom value. We discuss at each stage whether the different schemes we analyse have been encountered already in existing malware, and the likelihood of them being implemented and becoming successful. We hope this work will help to gain some useful insights for predicting how ransomware may evolve in the future and be better prepared to counter its current and future threat.

CRNov 25, 2014
Detecting fraudulent activity in a cloud using privacy-friendly data aggregates

Marc Solanas, Julio Hernandez-Castro, Debojyoti Dutta

More users and companies make use of cloud services every day. They all expect a perfect performance and any issue to remain transparent to them. This last statement is very challenging to perform. A user's activities in our cloud can affect the overall performance of our servers, having an impact on other resources. We can consider these kind of activities as fraudulent. They can be either illegal activities, such as launching a DDoS attack or just activities which are undesired by the cloud provider, such as Bitcoin mining, which uses substantial power, reduces the life of the hardware and can possibly slow down other user's activities. This article discusses a method to detect such activities by using non-intrusive, privacy-friendly data: billing data. We use OpenStack as an example with data provided by Telemetry, the component in charge of measuring resource usage for billing purposes. Results will be shown proving the efficiency of this method and ways to improve it will be provided as well as its advantages and disadvantages.

CROct 11, 2013
Measuring Software Diversity, with Applications to Security

Julio Hernandez-Castro, Jeremy Rossman

In this work, we briefly introduce and discuss some of the diversity measures used in Ecology. After a succinct description and analysis of the most relevant ones, we single out the Shannon-Weiner index. We justify why it is the most informative and relevant one for measuring software diversity. Then, we show how it can be used for effectively assessing the diversity of various real software ecosystems. We discover in the process a frequently overlooked software monopoly, and its key security implications. We finally extract some conclusions from the results obtained, focusing mostly on their security implications.

MMSep 27, 2013
Steganography using the Extensible Messaging and Presence Protocol (XMPP)

Reshad Patuck, Julio Hernandez-Castro

We present here the first work to propose different mechanisms for hiding data in the Extensible Messaging and Presence Protocol (XMPP). This is a very popular instant messaging protocol used by many messaging platforms such as Google Talk, Cisco, LiveJournal and many others. Our paper describes how to send a secret message from one XMPP client to another, without raising the suspicion of any intermediaries. The methods described primarily focus on using the underlying protocol as a means for steganography, unlike other related works that try to hide data in the content of instant messages. In doing so, we provide a more robust means of data hiding and additionally offer some preliminary analysis of its general security, in particular against entropic-based steganalysis.