CR May 23
MultiPhishGuard: An Explainable and Adaptive Multi-Agent LLM System for Phishing Email Detection
Yinuo Xue, Eric Spero, Meng Wai Woo et al.
Phishing email detection faces significant challenges due to evolving adversarial tactics and heterogeneous attack patterns. Traditional approaches, such as rule-based filters and denylists, often struggle to keep pace, leading to missed detections and security risks. While machine learning methods have improved detection performance, they remain limited in adapting to novel and rapidly changing phishing strategies. We present MultiPhishGuard, an LLM-based multi-agent detection framework with learned coordination across specialized agents. The system consists of five cooperative agents (text, URL, metadata, explanation simplifier, and adversarial agents), with agent contributions dynamically weighted using Proximal Policy Optimization. To address emerging threats, the framework incorporates an adversarial training loop in which an LLM-based agent generates subtle, context-aware email variants to expose potential model weaknesses and improve robustness to ambiguous phishing cases. Experimental evaluations on public datasets show that MultiPhishGuard achieves stronger performance than established baselines, including Chain-of-Thought prompting and single-agent variants, as supported by ablation studies and comparative analyses. The system achieves an accuracy of 97.89%, with a false positive rate of 2.73% and a false negative rate of 0.20%. In addition, an explanation simplifier agent transforms technical model outputs into plain-language rationales intended for human users. Overall, these results suggest that multi-agent LLM architectures with adaptive coordination and adversarial training represent a promising direction for phishing email detection.
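The abstract says agent contributions are dynamically weighted but gives no formula. As a minimal sketch of one way such a fusion step could look, the snippet below combines per-agent phishing scores with softmax-normalized weights. The function names, the agent keys, and the use of a softmax are all assumptions for illustration; the weight logits are plain inputs here, whereas in the paper they would come from a policy trained with Proximal Policy Optimization, which is not implemented.

```python
import math


def softmax(logits):
    """Turn unnormalized weights into positive weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_agent_scores(scores, weight_logits):
    """Combine per-agent phishing probabilities into one score.

    scores:        hypothetical dict, agent name -> probability in [0, 1]
    weight_logits: hypothetical dict, agent name -> unnormalized weight
                   (assumed to be produced by some learned policy)
    """
    names = sorted(scores)
    weights = softmax([weight_logits[name] for name in names])
    return sum(w * scores[name] for w, name in zip(weights, names))


# Illustrative values only; not taken from the paper.
scores = {"text": 0.9, "url": 0.6, "metadata": 0.3}
equal = fuse_agent_scores(scores, {"text": 0.0, "url": 0.0, "metadata": 0.0})
text_heavy = fuse_agent_scores(scores, {"text": 5.0, "url": 0.0, "metadata": 0.0})
```

With equal logits the fused score is the plain average of the agent scores; raising one agent's logit pulls the result toward that agent's score.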
LG Oct 23, 2023
Zero-Knowledge Proof-based Verifiable Decentralized Machine Learning in Communication Network: A Comprehensive Survey
Zhibo Xing, Zijian Zhang, Ziang Zhang et al.
Over recent decades, machine learning has significantly advanced network communication, enabling improved decision-making, user behavior analysis, and fault detection. Decentralized approaches, where participants exchange computation results instead of raw private data, mitigate the privacy risks of sharing raw data but introduce challenges related to trust and verifiability. A critical issue arises: How can one ensure the integrity and validity of computation results shared by other participants? Existing survey articles predominantly address security and privacy concerns in decentralized machine learning, whereas this survey highlights the emerging issue of verifiability. Recognizing the critical role of zero-knowledge proofs in ensuring verifiability, we present a comprehensive review of Zero-Knowledge Proof-based Verifiable Machine Learning (ZKP-VML). To clarify the research problem, we present a definition of ZKP-VML consisting of four algorithms, along with several corresponding key security properties. In addition, we provide an overview of the current research landscape by systematically organizing the research timeline and categorizing existing schemes based on their security properties. Furthermore, through an in-depth analysis of each existing scheme, we summarize their technical contributions and optimization strategies, aiming to uncover common design principles underlying ZKP-VML schemes. Building on the reviews and analysis presented, we identify current research challenges and suggest future research directions. To the best of our knowledge, this is the most comprehensive survey to date on verifiable decentralized machine learning and ZKP-VML.
CR May 29
R+R: Reassessing Java Security API Misuse in Current LLMs: A Replication on JCA and JSSE APIs with External Security Knowledge
Tianhe Lu, Eric Spero, Sakuna Harinda Jayasundara et al.
The misuse of Java security APIs is a serious security problem in software development. Research in 2024 has shown that this problem is widespread in LLM-generated code. However, it remains unclear whether this phenomenon persists in current models and how external security knowledge affects it. This paper presents a scoped replication and extension of Mousavi et al.'s study on the Java Cryptography Architecture (JCA) and Java Secure Socket Extension (JSSE) APIs. We focus on two complementary settings: GPT-5.5 as a frontier proprietary coding model, and Llama-3.3-70B-Instruct as a strong open-weight model relevant to self-hosted deployment. The results show that although newer LLMs perform better in using Java security APIs, the problem of Java security API misuse has not been eliminated. External security knowledge substantially improves the measured outcome, but its effect is model-dependent. For Llama-3.3-70B-Instruct, secure code examples are the most effective single knowledge type. For GPT-5.5, explicit misuse patterns eliminate all detected security API misuses among valid programs in our benchmark, although some outputs remain invalid due to compilation errors or target-API mismatches. In addition, developer-guide knowledge becomes much more effective, and secure prompting also provides large gains for GPT-5.5. Overall, these findings confirm the Java security API misuse risk identified in the original study and show that the benefits of retrieval-augmented knowledge depend not only on the knowledge itself and retrieval behavior, but also on model capability.
CR Sep 8, 2024
RAGent: Retrieval-based Access Control Policy Generation
Sakuna Harinda Jayasundara, Nalin Asanka Gamagedara Arachchilage, Giovanni Russello
Manually generating access control policies from an organization's high-level requirement specifications poses significant challenges. It requires laborious effort to sift through multiple documents containing such specifications and translate their access requirements into access control policies. Also, the complexities and ambiguities of these specifications often result in errors by system administrators during the translation process, leading to data breaches. However, the automated policy generation frameworks designed to help administrators in this process are unreliable due to limitations such as the lack of domain adaptation. Therefore, to improve the reliability of access control policy generation, we propose RAGent, a novel retrieval-based access control policy generation framework based on language models. RAGent identifies access requirements from high-level requirement specifications with an average state-of-the-art F1 score of 87.9%. Through retrieval augmented generation, RAGent then translates the identified access requirements into access control policies with an F1 score of 77.9%. Unlike existing frameworks, RAGent generates policies with complex components like purposes and conditions, in addition to subjects, actions, and resources. Moreover, RAGent automatically verifies the generated policies and iteratively refines them through a novel verification-refinement mechanism, further improving the reliability of the process by 3%, reaching an F1 score of 80.6%. We also introduce three annotated datasets for developing access control policy generation frameworks in the future, addressing the data scarcity of the domain.
CR Oct 5, 2023
SoK: Access Control Policy Generation from High-level Natural Language Requirements
Sakuna Harinda Jayasundara, Nalin Asanka Gamagedara Arachchilage, Giovanni Russello
Administrator-centered access control failures can cause data breaches, putting organizations at risk of financial loss and reputation damage. Existing graphical policy configuration tools and automated policy generation frameworks attempt to help administrators configure and generate access control policies by avoiding such failures. However, graphical policy configuration tools are prone to human errors, making them unusable. On the other hand, automated policy generation frameworks are prone to erroneous predictions, making them unreliable. Therefore, to find ways to improve their usability and reliability, we conducted a Systematic Literature Review analyzing 49 publications, to identify those tools, frameworks, and their limitations. Identifying those limitations will help develop effective access control policy generation solutions while avoiding access control failures.
CR Jun 3, 2024
No Vandalism: Privacy-Preserving and Byzantine-Robust Federated Learning
Zhibo Xing, Zijian Zhang, Zi'ang Zhang et al.
Federated learning allows several clients to train one machine learning model jointly without sharing private data, providing privacy protection. However, traditional federated learning is vulnerable to poisoning attacks, which can not only decrease the model performance but also implant malicious backdoors. In addition, direct submission of local model parameters can lead to privacy leakage of the training dataset. In this paper, we aim to build a privacy-preserving and Byzantine-robust federated learning scheme to provide an environment with no vandalism (NoV) against attacks from malicious participants. Specifically, we construct a model filter for poisoned local models, protecting the global model from data and model poisoning attacks. This model filter combines zero-knowledge proofs to provide further privacy protection. Then, we adopt secret sharing to provide verifiable secure aggregation, removing malicious clients that disrupt the aggregation process. Our formal analysis proves that NoV can protect data privacy and weed out Byzantine attackers. Our experiments illustrate that NoV can effectively address data and model poisoning attacks, including PGD, and outperforms other related schemes.
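The abstract does not say which secret-sharing construction NoV uses. As a rough illustration of the general idea behind secret-sharing-based aggregation, the sketch below uses plain additive secret sharing over a prime field: each client splits its value into random shares, each aggregator sums the shares it receives, and only the total is reconstructed. It is not NoV's protocol and omits the verifiability and malicious-client removal the paper describes. The modulus, the function names, and the assumption that inputs are non-negative integers smaller than the modulus (for example, fixed-point encoded parameters) are all choices made for this example.

```python
import secrets

# Assumed field modulus for this sketch (a Mersenne prime).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split a non-negative integer below PRIME into n additive shares."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME  # makes all shares sum to value mod PRIME
    return shares + [last]


def reconstruct(shares):
    """Recover the shared value by summing all shares modulo PRIME."""
    return sum(shares) % PRIME


def secure_sum(client_values, n_aggregators):
    """Aggregate client values without any aggregator seeing a single input.

    Aggregator j receives only the j-th share of every client and keeps a
    running sum; combining the per-aggregator sums yields the total.
    """
    per_aggregator = [0] * n_aggregators
    for value in client_values:
        for j, s in enumerate(share(value, n_aggregators)):
            per_aggregator[j] = (per_aggregator[j] + s) % PRIME
    return reconstruct(per_aggregator)


total = secure_sum([3, 5, 7], n_aggregators=3)
```

Any proper subset of a client's shares is uniformly random, so an individual aggregator learns nothing about that client's input as long as at least one aggregator does not collude.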
CR Feb 16, 2022
SoK: Human-Centered Phishing Susceptibility
Sijie Zhuo, Robert Biddle, Yun Sing Koh et al.
Phishing is recognised as a serious threat to organisations and individuals. While there have been significant technical advances in blocking phishing attacks, people remain the last line of defence after phishing emails reach their email client. Most of the existing literature on this subject has focused on the technical aspects related to phishing. However, the factors that cause humans to be susceptible to phishing attacks are still not well understood. To fill this gap, we reviewed the available literature and propose a three-stage Phishing Susceptibility Model (PSM) that explains how humans are involved in phishing detection and prevention. We systematically investigate the phishing susceptibility variables studied in the literature and taxonomize them using our model. This model reveals several research gaps that need to be addressed to improve users' detection performance. We also propose a practical impact assessment of the value of studying the phishing susceptibility variables, and quality of evidence criteria. These can serve as guidelines for future research to improve experiment design and result quality, and to increase the reliability and generalizability of findings.
CR Nov 5, 2020
Towards a Theory of Special-purpose Program Obfuscation
Muhammad Rizwan Asghar, Steven Galbraith, Andrea Lanzi et al.
Most recent theoretical literature on program obfuscation is based on notions like Virtual Black Box (VBB) obfuscation and indistinguishability Obfuscation (iO). These notions are very strong and are hard to satisfy. Further, they offer far more protection than is typically required in practical applications. On the other hand, the security notions introduced by software security researchers are suitable for practical designs but are not formal or precise enough to enable researchers to provide a quantitative security assurance. Hence, in this paper, we introduce a new formalism for practical program obfuscation that still allows rigorous security proofs. We believe our formalism will make it easier to analyse the security of obfuscation schemes. To show the flexibility and power of our formalism, we give a number of examples. Moreover, we explain the close relationship between our formalism and the task of providing obfuscation challenges. This is the full version of the paper. In this version, we also give a new rigorous analysis of several obfuscation techniques and we provide directions for future research.
CR Sep 25, 2019
Privacy-preserving Searchable Databases with Controllable Leakage
Shujie Cui, Xiangfu Song, Muhammad Rizwan Asghar et al.
Searchable Encryption (SE) is a technique that allows Cloud Service Providers (CSPs) to search over encrypted datasets without learning the content of queries and records. In recent years, many SE schemes have been proposed to protect outsourced data from CSPs. Unfortunately, most of them leak sensitive information, from which the CSPs could still infer the content of queries and records by mounting leakage-based inference attacks, such as the count attack and file injection attack. In this work, we first define the leakage in searchable encrypted databases and analyse how the leakage is leveraged in existing leakage-based attacks. Second, we propose a Privacy-preserving Multi-cloud based dynamic symmetric SE (SSE) scheme for relational Database (P-McDb). P-McDb has minimal leakage, which not only ensures confidentiality of queries and records, but also protects the search, access, and size patterns from CSPs. Moreover, P-McDb ensures both forward and backward privacy of the database. Thus, P-McDb could resist existing leakage-based attacks, e.g., active file/record-injection attacks. We give a security definition and analysis to show how P-McDb hides the aforementioned patterns. Finally, we implemented a prototype of P-McDb and tested it using the TPC-H benchmark dataset. Our evaluation results show the feasibility and practical efficiency of P-McDb.
CR Aug 15, 2013
ESPOON$_{ERBAC}$: Enforcing Security Policies In Outsourced Environments
Muhammad Rizwan Asghar, Mihaela Ion, Giovanni Russello et al.
Data outsourcing is a growing business model offering services to individuals and enterprises for processing and storing huge amounts of data. It is not only economical but also promises higher availability, scalability, and more effective quality of service than in-house solutions. Despite all its benefits, data outsourcing raises serious security concerns for preserving data confidentiality. There are solutions for preserving confidentiality of data while supporting search on the data stored in outsourced environments. However, such solutions do not support access policies to regulate access to a particular subset of the stored data. For complex user management, large enterprises employ Role-Based Access Control (RBAC) models for making access decisions based on the role in which a user is active. However, RBAC models cannot be deployed in outsourced environments as they rely on trusted infrastructure in order to regulate access to the data. The deployment of RBAC models may reveal private information about the sensitive data they aim to protect. In this paper, we aim at filling this gap by proposing ESPOON$_{ERBAC}$ for enforcing RBAC policies in outsourced environments. ESPOON$_{ERBAC}$ enforces RBAC policies in an encrypted manner where a curious service provider may learn only very limited information about RBAC policies. We have implemented ESPOON$_{ERBAC}$ and provided its performance evaluation showing a limited overhead, thus confirming viability of our approach.
CR Jun 20, 2013
ESPOON: Enforcing Encrypted Security Policies in Outsourced Environments
Muhammad Rizwan Asghar, Mihaela Ion, Giovanni Russello et al.
The enforcement of security policies in outsourced environments is still an open challenge for policy-based systems. On the one hand, taking the appropriate security decision requires access to the policies. On the other hand, if such access is allowed in an untrusted environment, confidential information might be leaked by the policies. Current solutions are based on cryptographic operations that embed security policies with the security mechanism. Therefore, the enforcement of such policies is performed by allowing the authorised parties to access the appropriate keys. We believe that such solutions are far too rigid because they strictly intertwine authorisation policies with the enforcing mechanism. In this paper, we address the issue of enforcing security policies in an untrusted environment while protecting policy confidentiality. Our solution, ESPOON, aims to provide a clear separation between security policies and the enforcement mechanism. At the same time, the enforcement mechanism should learn as little as possible about both the policies and the requester attributes.