Nicola Mazzocca

LG
h-index19
11papers
70citations
Novelty39%
AI Score48

11 Papers

SENov 14, 2025
Architecting software monitors for control-flow anomaly detection through large language models and conformance checking

Francesco Vitale, Francesco Flammini, Mauro Caporuscio et al.

Context: Ensuring high levels of dependability in modern computer-based systems has become increasingly challenging due to their complexity. Although systems are validated at design time, their behavior can be different at run-time, possibly showing control-flow anomalies due to "unknown unknowns". Objective: We aim to detect control-flow anomalies through software monitoring, which verifies run-time behavior by logging software execution and detecting deviations from expected control flow. Methods: We propose a methodology to develop software monitors for control-flow anomaly detection through Large Language Models (LLMs) and conformance checking. The methodology builds on existing software development practices to maintain traditional V&V while providing an additional level of robustness and trustworthiness. It leverages LLMs to link design-time models and implementation code, automating source-code instrumentation. The resulting event logs are analyzed via conformance checking, an explainable and effective technique for control-flow anomaly detection. Results: We test the methodology on a case-study scenario from the European Railway Traffic Management System / European Train Control System (ERTMS/ETCS), which is a railway standard for modern interoperable railways. The results obtained from the ERTMS/ETCS case study demonstrate that LLM-based source-code instrumentation can achieve up to 84.775% control-flow coverage of the reference design-time process model, while the subsequent conformance checking-based anomaly detection reaches a peak performance of 96.610% F1-score and 93.515% AUC. Conclusion: Incorporating domain-specific knowledge to guide LLMs in source-code instrumentation significantly allowed obtaining reliable and quality software logs and enabled effective control-flow anomaly detection through conformance checking.

CRApr 20
Dynamic Risk Assessment by Bayesian Attack Graphs and Process Mining

Francesco Vitale, Simone Guarino, Stefano Perone et al.

While attack graphs are useful for identifying major cybersecurity threats affecting a system, they do not provide operational support for determining the likelihood of having a known vulnerability exploited, or that critical system nodes are likely to be compromised. In this paper, we perform dynamic risk assessment by combining Bayesian Attack Graphs (BAGs) and online monitoring of system behavior through process mining. Specifically, the proposed approach applies process mining techniques to characterize malicious network traffic and derive evidence regarding the probability of having a vulnerability actively exploited. This evidence is then provided to a BAG, which updates its conditional probability tables accordingly, enabling dynamic assessment of vulnerability exploitation. We apply our method to a cybersecurity testbed instantiating several machines deployed on different subnets and affected by several CVE vulnerabilities. The testbed is stimulated with both benign traffic and malicious behavior, which simulates network attack patterns aimed at exploiting the CVE vulnerabilities. The results indicate that our proposal effectively detects whether vulnerabilities are being actively exploited, allowing for an updated assessment of the probability of system compromise.

CRApr 20
Enhancing Anomaly-Based Intrusion Detection Systems with Process Mining

Francesco Vitale, Francesco Grimaldi, Massimiliano Rak et al.

Anomaly-based Intrusion Detection Systems (IDSs) ensure protection against malicious attacks on networked systems. While deep learning-based IDSs achieve effective performance, their limited trustworthiness due to black-box architectures remains a critical constraint. Despite existing explainable techniques offering insight into the alarms raised by IDSs, they lack process-based explanations grounded in packet-level sequencing analysis. In this paper, we propose a method that employs process mining techniques to enhance anomaly-based IDSs by providing process-based alarm severity ratings and explanations for alerts. Our method prioritizes critical alerts and maintains visibility into network behavior, while minimizing disruption by allowing misclassified benign traffic to pass. We apply the method to the publicly available USB-IDS-TC dataset, which includes anomalous traffic affected by different variants of the Slowloris DoS attack. Results show that our method is able to discriminate between low- to very-high-severity alarms while preserving up to 99.94% recall and 99.99% precision, effectively discarding false positives while providing different degrees of severity for the true positives.

LGDec 3, 2025
Adaptive Identification and Modeling of Clinical Pathways with Process Mining

Francesco Vitale, Nicola Mazzocca

Clinical pathways are specialized healthcare plans that model patient treatment procedures. They are developed to provide criteria-based progression and standardize patient treatment, thereby improving care, reducing resource use, and accelerating patient recovery. However, manual modeling of these pathways based on clinical guidelines and domain expertise is difficult and may not reflect the actual best practices for different variations or combinations of diseases. We propose a two-phase modeling method using process mining, which extends the knowledge base of clinical pathways by leveraging conformance checking diagnostics. In the first phase, historical data of a given disease is collected to capture treatment in the form of a process model. In the second phase, new data is compared against the reference model to verify conformance. Based on the conformance checking results, the knowledge base can be expanded with more specific models tailored to new variants or disease combinations. We demonstrate our approach using Synthea, a benchmark dataset simulating patient treatments for SARS-CoV-2 infections with varying COVID-19 complications. The results show that our method enables expanding the knowledge base of clinical pathways with sufficient precision, peaking to 95.62% AUC while maintaining an arc-degree simplicity of 67.11%.

LGFeb 14, 2025
Control-flow anomaly detection by process mining-based feature extraction and dimensionality reduction

Francesco Vitale, Marco Pegoraro, Wil M. P. van der Aalst et al.

The business processes of organizations may deviate from normal control flow due to disruptive anomalies, including unknown, skipped, and wrongly-ordered activities. To identify these control-flow anomalies, process mining can check control-flow correctness against a reference process model through conformance checking, an explainable set of algorithms that allows linking any deviations with model elements. However, the effectiveness of conformance checking-based techniques is negatively affected by noisy event data and low-quality process models. To address these shortcomings and support the development of competitive and explainable conformance checking-based techniques for control-flow anomaly detection, we propose a novel process mining-based feature extraction approach with alignment-based conformance checking. This variant aligns the deviating control flow with a reference process model; the resulting alignment can be inspected to extract additional statistics such as the number of times a given activity caused mismatches. We integrate this approach into a flexible and explainable framework for developing techniques for control-flow anomaly detection. The framework combines process mining-based feature extraction and dimensionality reduction to handle high-dimensional feature sets, achieve detection effectiveness, and support explainability. The results show that the framework techniques implementing our approach outperform the baseline conformance checking-based techniques while maintaining the explainable nature of conformance checking. We also provide an explanation of why existing conformance checking-based techniques may be ineffective.

LGSep 12, 2025
Run-Time Monitoring of ERTMS/ETCS Control Flow by Process Mining

Francesco Vitale, Tommaso Zoppi, Francesco Flammini et al.

Ensuring the resilience of computer-based railways is increasingly crucial to account for uncertainties and changes due to the growing complexity and criticality of those systems. Although their software relies on strict verification and validation processes following well-established best-practices and certification standards, anomalies can still occur at run-time due to residual faults, system and environmental modifications that were unknown at design-time, or other emergent cyber-threat scenarios. This paper explores run-time control-flow anomaly detection using process mining to enhance the resilience of ERTMS/ETCS L2 (European Rail Traffic Management System / European Train Control System Level 2). Process mining allows learning the actual control flow of the system from its execution traces, thus enabling run-time monitoring through online conformance checking. In addition, anomaly localization is performed through unsupervised machine learning to link relevant deviations to critical system components. We test our approach on a reference ERTMS/ETCS L2 scenario, namely the RBC/RBC Handover, to show its capability to detect and localize anomalies with high accuracy, efficiency, and explainability.

LGJun 26, 2025
Process mining-driven modeling and simulation to enhance fault diagnosis in cyber-physical systems

Francesco Vitale, Nicola Dall'Ora, Sebastiano Gaiardelli et al.

Fault diagnosis in Cyber-Physical Systems (CPSs) is essential for ensuring system dependability and operational efficiency by accurately detecting anomalies and identifying their root causes. However, the manual modeling of faulty behaviors often demands extensive domain expertise and produces models that are complex, error-prone, and difficult to interpret. To address this challenge, we present a novel unsupervised fault diagnosis methodology that integrates collective anomaly detection in multivariate time series, process mining, and stochastic simulation. Initially, collective anomalies are detected from low-level sensor data using multivariate time-series analysis. These anomalies are then transformed into structured event logs, enabling the discovery of interpretable process models through process mining. By incorporating timing distributions into the extracted Petri nets, the approach supports stochastic simulation of faulty behaviors, thereby enhancing root cause analysis and behavioral understanding. The methodology is validated using the Robotic Arm Dataset (RoAD), a widely recognized benchmark in smart manufacturing. Experimental results demonstrate its effectiveness in modeling, simulating, and classifying faulty behaviors in CPSs. This enables the creation of comprehensive fault dictionaries that support predictive maintenance and the development of digital twins for industrial environments.

DCDec 31, 2021
Virtualization over Multiprocessor System-on-Chip: an Enabling Paradigm for Industrial IoT

Alessandro Cilardo, Marcello Cinque, Luigi De Simone et al.

The next-generation Industrial Internet of Things (IIoT) inherently requires smart devices featuring rich connectivity, local intelligence, and autonomous behavior. Emerging Multiprocessor System-on-Chip (MPSoC) platforms along with comprehensive support for virtualization will represent two key building blocks for smart devices in future IIoT edge infrastructures. We review representative existing solutions, highlighting the aspects that are most relevant for integration in IIoT solutions. From the analysis, we derive a reference architecture for a general virtualization-ready edge IIoT node. We then analyze the implications and benefits for a concrete use case scenario and identify the crucial research challenges to be faced to bridge the gap towards full support for virtualization-ready IIoT nodes

SEMay 12, 2014
Semantic Support for Log Analysis of Safety-Critical Embedded Systems

Alessio Venticinque, Nicola Mazzocca, Salvatore Venticinque et al.

Testing is a relevant activity for the development life-cycle of Safety Critical Embedded systems. In particular, much effort is spent for analysis and classification of test logs from SCADA subsystems, especially when failures occur. The human expertise is needful to understand the reasons of failures, for tracing back the errors, as well as to understand which requirements are affected by errors and which ones will be affected by eventual changes in the system design. Semantic techniques and full text search are used to support human experts for the analysis and classification of test logs, in order to speedup and improve the diagnosis phase. Moreover, retrieval of tests and requirements, which can be related to the current failure, is supported in order to allow the discovery of available alternatives and solutions for a better and faster investigation of the problem.

SEApr 16, 2013
A new modeling approach to the safety evaluation of N-modular redundant computer systems in presence of imperfect maintenance

Francesco Flammini, Stefano Marrone, Nicola Mazzocca et al.

A large number of safety-critical control systems are based on N-modular redundant architectures, using majority voters on the outputs of independent computation units. In order to assess the compliance of these architectures with international safety standards, the frequency of hazardous failures must be analyzed by developing and solving proper formal models. Furthermore, the impact of maintenance faults has to be considered, since imperfect maintenance may degrade the safety integrity level of the system. In this paper we present both a failure model for voting architectures based on Bayesian Networks and a maintenance model based on Continuous Time Markov Chains, and we propose to combine them according to a compositional multiformalism modeling approach in order to analyze the impact of imperfect maintenance on the system safety. We also show how the proposed approach promotes the reuse and the interchange of models as well the interchange of solving tools.

SEMar 12, 2013
Automatic instantiation of abstract tests on specific configurations for large critical control systems

Francesco Flammini, Nicola Mazzocca, Antonio Orazzo

Computer-based control systems have grown in size, complexity, distribution and criticality. In this paper a methodology is presented to perform an abstract testing of such large control systems in an efficient way: an abstract test is specified directly from system functional requirements and has to be instantiated in more test runs to cover a specific configuration, comprising any number of control entities (sensors, actuators and logic processes). Such a process is usually performed by hand for each installation of the control system, requiring a considerable time effort and being an error prone verification activity. To automate a safe passage from abstract tests, related to the so called generic software application, to any specific installation, an algorithm is provided, starting from a reference architecture and a state-based behavioural model of the control software. The presented approach has been applied to a railway interlocking system, demonstrating its feasibility and effectiveness in several years of testing experience.