Diego Kreutz

CR
h-index7
13papers
61citations
Novelty33%
AI Score48

13 Papers

CRNov 1, 2025Code
MalDataGen: A Modular Framework for Synthetic Tabular Data Generation in Malware Detection

Kayua Oleques Paim, Angelo Gaspar Diniz Nogueira, Diego Kreutz et al.

High-quality data scarcity hinders malware detection, limiting ML performance. We introduce MalDataGen, an open-source modular framework for generating high-fidelity synthetic tabular data using modular deep learning models (e.g., WGAN-GP, VQ-VAE). Evaluated via dual validation (TR-TS/TS-TR), seven classifiers, and utility metrics, MalDataGen outperforms benchmarks like SDV while preserving data utility. Its flexible design enables seamless integration into detection pipelines, offering a practical solution for cybersecurity applications.

CRNov 1, 2025
MH-1M: A 1.34 Million-Sample Comprehensive Multi-Feature Android Malware Dataset for Machine Learning, Deep Learning, Large Language Models, and Threat Intelligence Research

Hendrio Braganca, Diego Kreutz, Vanderson Rocha et al.

We present MH-1M, one of the most comprehensive and up-to-date datasets for advanced Android malware research. The dataset comprises 1,340,515 applications, encompassing a wide range of features and extensive metadata. To ensure accurate malware classification, we employ the VirusTotal API, integrating multiple detection engines for comprehensive and reliable assessment. Our GitHub, Figshare, and Harvard Dataverse repositories provide open access to the processed dataset and its extensive supplementary metadata, totaling more than 400 GB of data and including the outputs of the feature extraction pipeline as well as the corresponding VirusTotal reports. Our findings underscore the MH-1M dataset's invaluable role in understanding the evolving landscape of malware.

CRNov 18, 2025Code
On-Premise SLMs vs. Commercial LLMs: Prompt Engineering and Incident Classification in SOCs and CSIRTs

Gefté Almeida, Marcio Pohlmann, Alex Severo et al.

In this study, we evaluate open-source models for security incident classification, comparing them with proprietary models. We utilize a dataset of anonymized real incidents, categorized according to the NIST SP 800-61r3 taxonomy and processed using five prompt-engineering techniques (PHP, SHP, HTP, PRP, and ZSL). The results indicate that, although proprietary models still exhibit higher accuracy, locally deployed open-source models provide advantages in privacy, cost-effectiveness, and data sovereignty.

CRNov 1, 2025
Exploiting Latent Space Discontinuities for Building Universal LLM Jailbreaks and Data Extraction Attacks

Kayua Oleques Paim, Rodrigo Brandao Mansilha, Diego Kreutz et al.

The rapid proliferation of Large Language Models (LLMs) has raised significant concerns about their security against adversarial attacks. In this work, we propose a novel approach to crafting universal jailbreaks and data extraction attacks by exploiting latent space discontinuities, an architectural vulnerability related to the sparsity of training data. Unlike previous methods, our technique generalizes across various models and interfaces, proving highly effective in seven state-of-the-art LLMs and one image generation model. Initial results indicate that when these discontinuities are exploited, they can consistently and profoundly compromise model behavior, even in the presence of layered defenses. The findings suggest that this strategy has substantial potential as a systemic attack vector.

CRApr 5
NetSecBed: A Container-Native Testbed for Reproducible Cybersecurity Experimentation

Leonardo Bitzki, Diego Kreutz, Tiago Heinrich et al.

Cybersecurity research increasingly depends on reproducible evidence, such as traffic traces, logs, and labeled datasets, yet most public datasets remain static and offer limited support for controlled re-execution and traceability, especially in heterogeneous multi-protocol environments. This paper presents NetSecBed, a container-native, scenario-oriented testbed for reproducible generation of network traffic evidence and execution artifacts under controlled conditions, particularly suitable for IoT, IIoT, and pervasive multi-protocol environments. The framework integrates 60 attack scenarios, 9 target services, and benign traffic generators as single-purpose containers, enabling plug-and-play extensibility and traceability through declarative specifications. Its pipeline automates parametrized execution, packet capture, log collection, service probing, feature extraction, and dataset consolidation. The main contribution is a repeatable, auditable, and extensible framework for cybersecurity experimentation that reduces operational bias and supports continuous dataset generation.

AINov 20, 2025
Reducing Instability in Synthetic Data Evaluation with a Super-Metric in MalDataGen

Anna Luiza Gomes da Silva, Diego Kreutz, Angelo Diniz et al.

Evaluating the quality of synthetic data remains a persistent challenge in the Android malware domain due to instability and the lack of standardization among existing metrics. This work integrates into MalDataGen a Super-Metric that aggregates eight metrics across four fidelity dimensions, producing a single weighted score. Experiments involving ten generative models and five balanced datasets demonstrate that the Super-Metric is more stable and consistent than traditional metrics, exhibiting stronger correlations with the actual performance of classifiers.

CRNov 28, 2025
IoTEdu: Access Control, Detection, and Automatic Incident Response in Academic IoT Networks

Joner Assolin, Diego Kreutz, Leandro Bertholdo

The growing presence of IoT devices in academic environments has increased operational complexity and exposed security weaknesses, especially in academic institutions without unified policies for registration, monitoring, and incident response involving IoT. This work presents IoTEdu, an integrated platform that combines access control, incident detection, and automatic blocking of IoT devices. The solution was evaluated in a controlled environment with simulated attacks, achieving an average time of 28.6 seconds between detection and blocking. The results show a reduction in manual intervention, standardization of responses, and unification of the processes of registration, monitoring, and incident response.

CRNov 25, 2025
A Taxonomy of Pix Fraud in Brazil: Attack Methodologies, AI-Driven Amplification, and Defensive Strategies

Glener Lanes Pizzolato, Brenda Medeiros Lopes, Claudio Schepke et al.

This work presents a review of attack methodologies targeting Pix, the instant payment system launched by the Central Bank of Brazil in 2020. The study aims to identify and classify the main types of fraud affecting users and financial institutions, highlighting the evolution and increasing sophistication of these techniques. The methodology combines a structured literature review with exploratory interviews conducted with professionals from the banking sector. The results show that fraud schemes have evolved from purely social engineering approaches to hybrid strategies that integrate human manipulation with technical exploitation. The study concludes that security measures must advance at the same pace as the growing complexity of attack methodologies, with particular emphasis on adaptive defenses and continuous user awareness.

CRNov 24, 2025
Synthetic Data: AI's New Weapon Against Android Malware

Angelo Gaspar Diniz Nogueira, Kayua Oleques Paim, Hendrio Bragança et al.

The ever-increasing number of Android devices and the accelerated evolution of malware, reaching over 35 million samples by 2024, highlight the critical importance of effective detection methods. Attackers are now using Artificial Intelligence to create sophisticated malware variations that can easily evade traditional detection techniques. Although machine learning has shown promise in malware classification, its success relies heavily on the availability of up-to-date, high-quality datasets. The scarcity and high cost of obtaining and labeling real malware samples presents significant challenges in developing robust detection models. In this paper, we propose MalSynGen, a Malware Synthetic Data Generation methodology that uses a conditional Generative Adversarial Network (cGAN) to generate synthetic tabular data. This data preserves the statistical properties of real-world data and improves the performance of Android malware classifiers. We evaluated the effectiveness of this approach using various datasets and metrics that assess the fidelity of the generated data, its utility in classification, and the computational efficiency of the process. Our experiments demonstrate that MalSynGen can generalize across different datasets, providing a viable solution to address the issues of obsolescence and low quality data in malware detection.

DCNov 21, 2025
Temperature in SLMs: Impact on Incident Categorization in On-Premises Environments

Marcio Pohlmann, Alex Severo, Gefté Almeida et al.

SOCs and CSIRTs face increasing pressure to automate incident categorization, yet the use of cloud-based LLMs introduces costs, latency, and confidentiality risks. We investigate whether locally executed SLMs can meet this challenge. We evaluated 21 models ranging from 1B to 20B parameters, varying the temperature hyperparameter and measuring execution time and precision across two distinct architectures. The results indicate that temperature has little influence on performance, whereas the number of parameters and GPU capacity are decisive factors.

LGJul 11, 2025
MH-FSF: A Unified Framework for Overcoming Benchmarking and Reproducibility Limitations in Feature Selection Evaluation

Vanderson Rocha, Diego Kreutz, Gabriel Canto et al.

Feature selection is vital for building effective predictive models, as it reduces dimensionality and emphasizes key features. However, current research often suffers from limited benchmarking and reliance on proprietary datasets. This severely hinders reproducibility and can negatively impact overall performance. To address these limitations, we introduce the MH-FSF framework, a comprehensive, modular, and extensible platform designed to facilitate the reproduction and implementation of feature selection methods. Developed through collaborative research, MH-FSF provides implementations of 17 methods (11 classical, 6 domain-specific) and enables systematic evaluation on 10 publicly available Android malware datasets. Our results reveal performance variations across both balanced and imbalanced datasets, highlighting the critical need for data preprocessing and selection criteria that account for these asymmetries. We demonstrate the importance of a unified platform for comparing diverse feature selection techniques, fostering methodological consistency and rigor. By providing this framework, we aim to significantly broaden the existing literature and pave the way for new research directions in feature selection, particularly within the context of Android malware detection.

CRJun 29, 2025
Interpretable by Design: MH-AutoML for Transparent and Efficient Android Malware Detection without Compromising Performance

Joner Assolin, Gabriel Canto, Diego Kreutz et al.

Malware detection in Android systems requires both cybersecurity expertise and machine learning (ML) techniques. Automated Machine Learning (AutoML) has emerged as an approach to simplify ML development by reducing the need for specialized knowledge. However, current AutoML solutions typically operate as black-box systems with limited transparency, interpretability, and experiment traceability. To address these limitations, we present MH-AutoML, a domain-specific framework for Android malware detection. MH-AutoML automates the entire ML pipeline, including data preprocessing, feature engineering, algorithm selection, and hyperparameter tuning. The framework incorporates capabilities for interpretability, debugging, and experiment tracking that are often missing in general-purpose solutions. In this study, we compare MH-AutoML against seven established AutoML frameworks: Auto-Sklearn, AutoGluon, TPOT, HyperGBM, Auto-PyTorch, LightAutoML, and MLJAR. Results show that MH-AutoML achieves better recall rates while providing more transparency and control. The framework maintains computational efficiency comparable to other solutions, making it suitable for cybersecurity applications where both performance and explainability matter.

NINov 9, 2017
ANCHOR: logically-centralized security for Software-Defined Networks

Diego Kreutz, Jiangshan Yu, Fernando M. V. Ramos et al.

While the centralization of SDN brought advantages such as a faster pace of innovation, it also disrupted some of the natural defenses of traditional architectures against different threats. The literature on SDN has mostly been concerned with the functional side, despite some specific works concerning non-functional properties like 'security' or 'dependability'. Though addressing the latter in an ad-hoc, piecemeal way, may work, it will most likely lead to efficiency and effectiveness problems. We claim that the enforcement of non-functional properties as a pillar of SDN robustness calls for a systemic approach. As a general concept, we propose ANCHOR, a subsystem architecture that promotes the logical centralization of non-functional properties. To show the effectiveness of the concept, we focus on 'security' in this paper: we identify the current security gaps in SDNs and we populate the architecture middleware with the appropriate security mechanisms, in a global and consistent manner. Essential security mechanisms provided by anchor include reliable entropy and resilient pseudo-random generators, and protocols for secure registration and association of SDN devices. We claim and justify in the paper that centralizing such mechanisms is key for their effectiveness, by allowing us to: define and enforce global policies for those properties; reduce the complexity of controllers and forwarding devices; ensure higher levels of robustness for critical services; foster interoperability of the non-functional property enforcement mechanisms; and promote the security and resilience of the architecture itself. We discuss design and implementation aspects, and we prove and evaluate our algorithms and mechanisms, including the formalisation of the main protocols and the verification of their core security properties using the Tamarin prover.