Ricardo Bessa

CR
3 papers
1 citation
Novelty: 25%
AI Score: 33

3 Papers

38.8 CR Apr 13
Towards Automated Pentesting with Large Language Models

Ricardo Bessa, Rui Claro, João Trindade et al.

Large Language Models (LLMs) are redefining offensive cybersecurity by allowing the generation of harmful machine code with minimal human intervention. While attackers take advantage of dark LLMs such as XXXGPT and WolfGPT to produce malicious code, ethical hackers can follow similar approaches to automate traditional pentesting workflows. In this work, we present RedShell, a privacy-preserving, hardware-efficient framework that leverages fine-tuned LLMs to assist pentesters in generating offensive PowerShell code targeting Microsoft Windows vulnerabilities. RedShell was trained on a malicious PowerShell dataset from the literature, which we further enhanced with manually curated code samples. Experiments show that our framework achieves over 90% syntactic validity in generated samples and strong semantic alignment with reference pentesting snippets, outperforming state-of-the-art counterparts in distance metrics such as edit distance (above 50% average code similarity). Additionally, functional experiments emphasize the execution reliability of the snippets produced by RedShell in a testing scenario that mirrors real-world settings. This work sheds light on the state-of-the-art research in the field of Generative AI applied to malicious code generation and automated testing, acknowledging the potential benefits that LLMs hold within controlled environments such as pentesting.

6.8 CR Apr 13
RedShell: A Generative AI-Based Approach to Ethical Hacking

Ricardo Bessa, Rui Claro, João Trindade et al.

The application of Machine Learning techniques in code generation is now common practice for most developers. Tools such as ChatGPT from OpenAI leverage the natural language processing capabilities of Large Language Models to generate machine code from natural language descriptions. In the cybersecurity field, red teams can also take advantage of generative models to build malicious code generators, bringing more automation to pentest audits. However, the application of Large Language Models to malicious code generation remains challenging due to the lack of data to train and evaluate offensive code generators. In this work, we propose RedShell, a tool that allows ethical hackers to generate malicious PowerShell code. We also introduce a ground truth dataset that combines publicly available code samples to fine-tune models for malicious PowerShell generation. Our experiments demonstrate the strong capabilities of RedShell in generating syntactically valid PowerShell, with fewer than 10% of the generated samples resulting in parse errors. Furthermore, our specialized model was able to produce samples that were semantically consistent with reference snippets, achieving competitive performance on standard output similarity metrics such as Edit Distance and METEOR, with mean similarity scores exceeding 50% and 40%, respectively. This work sheds light on the state-of-the-art research in the field of Generative AI applied to pentesting, and also serves as a stepping stone for future advancements, highlighting the potential benefits these models hold within such controlled environments.
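The Edit Distance similarity mentioned in the abstract can be illustrated with a minimal sketch. It assumes a character-level Levenshtein distance normalized by the length of the longer string. The abstract does not specify the exact normalization or tokenization used, so the function names and the formula below are illustrative assumptions rather than the authors' implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def edit_similarity(generated: str, reference: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical strings.

    Assumption: normalize by the longer string's length. Other
    normalizations exist and may be what the paper uses.
    """
    longest = max(len(generated), len(reference))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(generated, reference) / longest


if __name__ == "__main__":
    # Benign, hypothetical snippets used only to exercise the metric.
    print(edit_similarity("Get-Process -Name notepad", "Get-Process -Id 42"))
```

Under this normalization, a mean score above 0.5 would mean that, on average, fewer than half of the characters of the longer string need to be edited to turn a generated sample into its reference.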

NE Mar 26, 2017
Multi-Period Flexibility Forecast for Low Voltage Prosumers

Rui Pinto, Ricardo Bessa, Manuel Matos

The operation of near-future electric distribution grids will have to rely on demand-side flexibility, both through the implementation of demand response strategies and through the intelligent management of increasingly common small-scale energy storage. The home energy management system (HEMS), installed at low voltage residential clients, will play a crucial role in the provision of flexibility to both system operators and market players such as aggregators. Modeling and forecasting multi-period flexibility from residential prosumers' resources, such as battery storage and electric water heaters, while complying with internal constraints (comfort levels, data privacy) and uncertainty is a complex task. This paper describes a computational method capable of efficiently learning and defining the feasibility flexibility space from controllable resources connected to a HEMS. An Evolutionary Particle Swarm Optimization (EPSO) algorithm is adopted and reshaped to derive a set of feasible temporal trajectories for the residential net-load, considering storage, flexible appliances, and predefined customer preferences, as well as load and photovoltaic (PV) forecast uncertainty. A support vector data description (SVDD) algorithm is used to build models capable of classifying feasible and non-feasible HEMS operating trajectories upon request from an optimization/control algorithm operated by a DSO or market player.
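To make the notion of a feasible multi-period trajectory concrete, the sketch below checks whether a battery power schedule respects power and state-of-charge limits over consecutive time steps. It is neither the EPSO search nor the SVDD classifier from the paper. It is only a hand-written rule of the kind such a classifier might learn to approximate. The lossless battery model and all parameter names are hypothetical.

```python
from typing import Sequence


def is_feasible(
    battery_power_kw: Sequence[float],  # positive = charging, negative = discharging
    initial_soc_kwh: float,
    capacity_kwh: float,
    max_power_kw: float,
    step_hours: float = 1.0,
    tolerance: float = 1e-9,
) -> bool:
    """Return True if the schedule never violates power or energy limits.

    Assumption: an ideal battery with no charging or discharging losses.
    """
    soc = initial_soc_kwh
    for power in battery_power_kw:
        if abs(power) > max_power_kw + tolerance:
            return False  # inverter power limit exceeded
        soc += power * step_hours
        if soc < -tolerance or soc > capacity_kwh + tolerance:
            return False  # battery would be over-discharged or over-charged
    return True


if __name__ == "__main__":
    # Charge for two hours, then discharge everything in the third hour.
    print(is_feasible([1.0, 1.0, -2.0], initial_soc_kwh=0.0,
                      capacity_kwh=2.0, max_power_kw=2.0))
```

The check is inherently multi-period: whether a given power value is allowed at one step depends on the state of charge left by all earlier steps, which is why feasibility has to be assessed over whole trajectories rather than per time step.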