CRDec 18, 2025
Protecting Deep Neural Network Intellectual Property with Chaos-Based White-Box WatermarkingSangeeth B, Serena Nicolazzo, Deepa K. et al.
The rapid proliferation of deep neural networks (DNNs) across several domains has led to increasing concerns regarding intellectual property (IP) protection and model misuse. Trained DNNs represent valuable assets, often developed through significant investments. However, the ease with which models can be copied, redistributed, or repurposed highlights the urgent need for effective mechanisms to assert and verify model ownership. In this work, we propose an efficient and resilient white-box watermarking framework that embeds ownership information into the internal parameters of a DNN using chaotic sequences. The watermark is generated using a logistic map, a well-known chaotic function, producing a sequence that is sensitive to its initialization parameters. This sequence is injected into the weights of a chosen intermediate layer without requiring structural modifications to the model or degradation in predictive performance. To validate ownership, we introduce a verification process based on a genetic algorithm that recovers the original chaotic parameters by optimizing the similarity between the extracted and regenerated sequences. The effectiveness of the proposed approach is demonstrated through extensive experiments on image classification tasks using MNIST and CIFAR-10 datasets. The results show that the embedded watermark remains detectable after fine-tuning, with negligible loss in model accuracy. In addition to numerical recovery of the watermark, we perform visual analyses using weight density plots and construct activation-based classifiers to distinguish between original, watermarked, and tampered models. Overall, the proposed method offers a flexible and scalable solution for embedding and verifying model ownership in white-box settings well-suited for real-world scenarios where IP protection is critical.
CRJan 7, 2024
Privacy-Preserving in Blockchain-based Federated Learning SystemsSameera K. M., Serena Nicolazzo, Marco Arazzi et al.
Federated Learning (FL) has recently arisen as a revolutionary approach to collaborative training Machine Learning models. According to this novel framework, multiple participants train a global model collaboratively, coordinating with a central aggregator without sharing their local data. As FL gains popularity in diverse domains, security, and privacy concerns arise due to the distributed nature of this solution. Therefore, integrating this strategy with Blockchain technology has been consolidated as a preferred choice to ensure the privacy and security of participants. This paper explores the research efforts carried out by the scientific community to define privacy solutions in scenarios adopting Blockchain-Enabled FL. It comprehensively summarizes the background related to FL and Blockchain, evaluates existing architectures for their integration, and the primary attacks and possible countermeasures to guarantee privacy in this setting. Finally, it reviews the main application scenarios where Blockchain-Enabled FL approaches have been proficiently applied. This survey can help academia and industry practitioners understand which theories and techniques exist to improve the performance of FL through Blockchain to preserve privacy and which are the main challenges and future directions in this novel and still under-explored context. We believe this work provides a novel contribution respect to the previous surveys and is a valuable tool to explore the current landscape, understand perspectives, and pave the way for advancements or improvements in this amalgamation of Blockchain and Federated Learning.
CRApr 22
Towards Certified Malware Detection: Provable Guarantees Against Evasion AttacksNandakrishna Giri, Asmitha K. A., Serena Nicolazzo et al.
Machine learning-based static malware detectors remain vulnerable to adversarial evasion techniques, such as metamorphic engine mutations. To address this vulnerability, we propose a certifiably robust malware detection framework based on randomized smoothing through feature ablation and targeted noise injection. During evaluation, our system analyzes an executable by generating multiple ablated variants, classifies them by using a smoothed classifier, and identifies the final label based on the majority vote. By analyzing the top-class voting distribution and the Wilson score interval, we derive a formal certificate that guarantees robustness within a specific radius against feature-space perturbations. We evaluate our approach by comparing the performance of the base classifier and the smoothed classifier on both clean executables and ablated variants generated using PyMetaEngine. Our results demonstrate that the proposed smoothed classifier successfully provides certifiable robustness against metamorphic evasion attacks without requiring modifications to the underlying machine learning architecture.
CRApr 30, 2025
XBreaking: Understanding how LLMs security alignment can be brokenMarco Arazzi, Vignesh Kumar Kembu, Antonino Nocera et al.
Large Language Models are fundamental actors in the modern IT landscape dominated by AI solutions. However, security threats associated with them might prevent their reliable adoption in critical application scenarios such as government organizations and medical institutions. For this reason, commercial LLMs typically undergo a sophisticated censoring mechanism to eliminate any harmful output they could possibly produce. These mechanisms maintain the integrity of LLM alignment by guaranteeing that the models respond safely and ethically. In response to this, attacks on LLMs are a significant threat to such protections, and many previous approaches have already demonstrated their effectiveness across diverse domains. Existing LLM attacks mostly adopt a generate-and-test strategy to craft malicious input. To improve the comprehension of censoring mechanisms and design a targeted attack, we propose an Explainable-AI solution that comparatively analyzes the behavior of censored and uncensored models to derive unique exploitable alignment patterns. Then, we propose XBreaking, a novel approach that exploits these unique patterns to break the security and alignment constraints of LLMs by targeted noise injection. Our thorough experimental campaign returns important insights about the censoring mechanisms and demonstrates the effectiveness and performance of our approach.
CRSep 27, 2021
GANG-MAM: GAN based enGine for Modifying Android MalwareRenjith G, Sonia Laudanna, Aji S et al.
Malware detectors based on machine learning are vulnerable to adversarial attacks. Generative Adversarial Networks (GAN) are architectures based on Neural Networks that could produce successful adversarial samples. The interest towards this technology is quickly growing. In this paper, we propose a system that produces a feature vector for making an Android malware strongly evasive and then modify the malicious program accordingly. Such a system could have a twofold contribution: it could be used to generate datasets to validate systems for detecting GAN-based malware and to enlarge the training and testing dataset for making more robust malware classifiers.
CRApr 20, 2019
Can Machine Learning Model with Static Features be Fooled: an Adversarial Machine Learning ApproachRahim Taheri, Reza Javidan, Mohammad Shojafar et al.
The widespread adoption of smartphones dramatically increases the risk of attacks and the spread of mobile malware, especially on the Android platform. Machine learning-based solutions have been already used as a tool to supersede signature-based anti-malware systems. However, malware authors leverage features from malicious and legitimate samples to estimate statistical difference in-order to create adversarial examples. Hence, to evaluate the vulnerability of machine learning algorithms in malware detection, we propose five different attack scenarios to perturb malicious applications (apps). By doing this, the classification algorithm inappropriately fits the discriminant function on the set of data points, eventually yielding a higher misclassification rate. Further, to distinguish the adversarial examples from benign samples, we propose two defense mechanisms to counter attacks. To validate our attacks and solutions, we test our model on three different benchmark datasets. We also test our methods using various classifier algorithms and compare them with the state-of-the-art data poisoning method using the Jacobian matrix. Promising results show that generated adversarial samples can evade detection with a very high probability. Additionally, evasive variants generated by our attack models when used to harden the developed anti-malware system improves the detection rate up to 50% when using the Generative Adversarial Network (GAN) method.
CRSep 17, 2018
FeatureAnalytics: An approach to derive relevant attributes for analyzing Android MalwareDeepa K, Radhamani G, Vinod P et al.
Ever increasing number of Android malware, has always been a concern for cybersecurity professionals. Even though plenty of anti-malware solutions exist, a rational and pragmatic approach for the same is rare and has to be inspected further. In this paper, we propose a novel two-set feature selection approach based on Rough Set and Statistical Test named as RSST to extract relevant system calls. To address the problem of higher dimensional attribute set, we derived suboptimal system call space by applying the proposed feature selection method to maximize the separability between malware and benign samples. Comprehensive experiments conducted on a dataset consisting of 3500 samples with 30 RSST derived essential system calls resulted in an accuracy of 99.9%, Area Under Curve (AUC) of 1.0, with 1% False Positive Rate (FPR). However, other feature selectors (Information Gain, CFsSubsetEval, ChiSquare, FreqSel and Symmetric Uncertainty) used in the domain of malware analysis resulted in the accuracy of 95.5% with 8.5% FPR. Besides, empirical analysis of RSST derived system calls outperform other attributes such as permissions, opcodes, API, methods, call graphs, Droidbox attributes and network traces.