João Soares

CR
h-index: 7
3 papers
1 citation
Novelty: 42%
AI Score: 39

3 Papers

DC Apr 20
Trust, but Verify: ByzTwin-Range, a Digital Twin Cyber-Range for Byzantine Faults

Tadeu Freitas, João Soares, Rolando Martins

Critical infrastructures increasingly rely on interconnected and software-driven Cyber-Physical Systems (CPS), exposing operational processes to both accidental failures and sophisticated adversarial behavior. While Byzantine Fault Tolerant (BFT) protocols offer robustness against arbitrary faults, evaluating their behavior under realistic cyber-physical conditions remains challenging: traditional cyber ranges lack timing fidelity, and testing in production environments is unsafe. This paper introduces ByzTwin-Range, a dual-layer architecture that integrates a production-grade BFT deployment with a Digital Twin (DT) to enable controlled experimentation, stress testing, and Byzantine fault injection using live operational data. The DT mirrors real system state, executes "What-if" analyses through co-simulation and emulation, and identifies synchrony vulnerabilities (i.e., misconfigured timeouts, timing-sensitive false suspicions, and adversarial delay exploits), configuration weaknesses, and adversarial behaviors that may undermine BFT guarantees. Insights from the twin are fed back into the operational deployment through a secure advisory channel, supporting continuous validation and adaptive hardening. The proposed design leverages industry-standard technologies (Open Platform Communications Unified Architecture, Time-Sensitive Networking, Functional Mock-up Unit/High-Level Architecture, QUIC/mutual TLS) to maximize feasibility and compatibility with existing industrial workflows. ByzTwin-Range establishes a practical foundation for next-generation, BFT-aware cyber ranges and paves the way for more resilient CPSs through continuous testing, differential-privacy-enabled analytics, and future proof-of-concept implementations.
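
As a rough illustration of the kind of "What-if" timeout analysis the abstract describes, the sketch below counts how many replies from correct replicas would be treated as late (and thus trigger a false suspicion) under different candidate timeouts. It is not taken from the paper: the names `WhatIfResult` and `sweep_timeouts`, and the sample delay values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class WhatIfResult:
    timeout_ms: float
    false_suspicions: int  # replies from correct replicas that miss the timeout


def sweep_timeouts(observed_delays_ms, candidate_timeouts_ms):
    """For each candidate timeout, count observed delays that exceed it.

    Assumption: every delay comes from a correct (non-faulty) replica, so
    any reply arriving after the timeout would be a false suspicion.
    """
    results = []
    for timeout in candidate_timeouts_ms:
        late = sum(1 for delay in observed_delays_ms if delay > timeout)
        results.append(WhatIfResult(timeout, late))
    return results


# Hypothetical delays (in milliseconds) mirrored from a running deployment.
delays = [12.0, 15.5, 48.0, 9.8, 110.0]
for result in sweep_timeouts(delays, [20.0, 50.0, 150.0]):
    print(result.timeout_ms, result.false_suspicions)
```

A twin could run such a sweep on mirrored delay traces and only advise a timeout change once the count of false suspicions is acceptable, without touching the production deployment.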

CR Oct 8, 2025
RedTWIZ: Diverse LLM Red Teaming via Adaptive Attack Planning

Artur Horal, Daniel Pina, Henrique Paz et al.

This paper presents the vision, scientific contributions, and technical details of RedTWIZ, an adaptive and diverse multi-turn red teaming framework for auditing the robustness of Large Language Models (LLMs) in AI-assisted software development. Our work is driven by three major research streams: (1) robust and systematic assessment of LLM conversational jailbreaks; (2) a diverse generative multi-turn attack suite, supporting compositional, realistic and goal-oriented jailbreak conversational strategies; and (3) a hierarchical attack planner, which adaptively plans, serializes, and triggers attacks tailored to a specific LLM's vulnerabilities. Together, these contributions form a unified framework (combining assessment, attack generation, and strategic planning) to comprehensively evaluate and expose weaknesses in LLMs' robustness. Extensive evaluation is conducted to systematically assess and analyze the performance of the overall system and each component. Experimental results demonstrate that our multi-turn adversarial attack strategies can successfully lead state-of-the-art LLMs to produce unsafe generations, highlighting the pressing need for more research into enhancing LLMs' robustness.

CR Aug 18, 2025
A Risk Manager for Intrusion Tolerant Systems: Enhancing HAL 9000 with New Scoring and Data Sources

Tadeu Freitas, Carlos Novo, Inês Dutra et al.

Intrusion Tolerant Systems (ITSs) have become increasingly critical due to the rise of multi-domain adversaries exploiting diverse attack surfaces. ITS architectures aim to tolerate intrusions, ensuring system compromise is prevented or mitigated even in the presence of an adversary. Existing ITS solutions often employ Risk Managers leveraging public security intelligence to adjust system defenses dynamically against emerging threats. However, these approaches rely heavily on databases like NVD and ExploitDB, which require manual analysis for newly discovered vulnerabilities. This dependency limits the system's responsiveness to rapidly evolving threats. HAL 9000, an ITS Risk Manager introduced in our prior work, addressed these challenges through machine learning. By analyzing descriptions of known vulnerabilities, HAL 9000 predicts and assesses new vulnerabilities automatically. To calculate the risk of a system, it also incorporates the Exploit Prediction Scoring System (EPSS) to estimate the likelihood of exploitation within 30 days, enhancing proactive defense capabilities. Despite its success, HAL 9000's reliance on NVD and ExploitDB knowledge is a limitation, considering the availability of other sources of information. This extended work introduces a custom-built scraper that continuously mines diverse threat sources, including security advisories, research forums, and real-time exploit proofs-of-concept. This significantly expands HAL 9000's intelligence base, enabling earlier detection and assessment of unverified vulnerabilities. Our evaluation demonstrates that integrating scraper-derived intelligence with HAL 9000's risk management framework substantially improves its ability to address emerging threats. This paper details the scraper's integration into the architecture, its role in providing additional information on new threats, and the effects on HAL 9000's risk management.
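
To make the idea of a system-level risk estimate built from per-vulnerability 30-day exploitation likelihoods more concrete, here is a minimal sketch. It is not HAL 9000's actual scoring method, which the abstract does not specify. The function name `system_exploit_probability` is hypothetical, and the calculation assumes that vulnerabilities are exploited independently of one another, which is a simplification.

```python
import math


def system_exploit_probability(exploit_probabilities):
    """Probability that at least one vulnerability is exploited in the window.

    Assumption (not from the paper): each input is a per-vulnerability
    probability in [0, 1], and exploitation events are independent, so
    P(at least one) = 1 - product(1 - p_i).
    """
    for p in exploit_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return 1.0 - math.prod(1.0 - p for p in exploit_probabilities)


# Hypothetical per-vulnerability likelihoods for one replica's software stack.
print(system_exploit_probability([0.02, 0.10, 0.40]))
```

Under this simplification, a single high-likelihood vulnerability dominates the result, which is one reason earlier intelligence about newly disclosed vulnerabilities can shift a system's risk estimate noticeably.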