Gaurav Varshney

CR
5papers
10citations
Novelty34%
AI Score42

5 Papers

80.8CRApr 19Code
GuardPhish: Securing Open-Source LLMs from Phishing Abuse

Rina Mishra, Gaurav Varshney, Doddipatla Sesha Sahithi

The rapid adoption of open-source Large Language Models (LLMs) in offline and enterprise environments has introduced a largely unexamined security risk like susceptibility to adversarial phishing prompts under static safety configurations. In this work, we systematically investigate this vulnerability through GuardPhish, a large scale multi-vector phishing prompt dataset comprising 70,015 samples spanning web, email, SMS, and voice attack scenarios derived from real world campaigns. Using a deterministic five model ensemble for labeling, we achieve near perfect inter model agreement (Fleiss kappa = 0.9141), with residual disagreements resolved through expert adjudication. By evaluating eight open-source LLMs under fully offline inference conditions, we uncover a substantial enforcement gap like models that correctly identify phishing intent with detection rates up to 96% nevertheless generate actionable phishing content from identical prompts, with attack success rates reaching 98.5% in voice-based scenarios. These findings demonstrate that intent classification alone does not guarantee generative refusal in the absence of dynamic guardrails. To mitigate this risk, we train transformer based classifiers on GuardPhish, achieving up to 98.27% accuracy as modular pre-generation filters deployable without modifying the underlying generative model. Our results highlight a critical weakness in current open-source LLM deployments and provide a reproducible foundation for strengthening defenses against phishing and social engineering attacks.

59.2CRMay 3
Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning

Zhiyang Dai, Yansong Gao, Boyu Kuang et al.

Contrastive learning (CL) reduces annotation cost via auto-derived supervisory signals. Since large-scale in-house CL datasets are infeasible, reliance on third-party or internet data is common. Recent studies show CL models are vulnerable to data-poisoning backdoor attacks, but their generalization and robustness are underexplored. We systematically evaluate existing data-poisoning backdoor attacks on CL, revealing limitations: poor dataset adaptability, low success rates, limited portability, and restrictive assumptions (e.g., downstream task knowledge). Interestingly, trigger samples exhibit distinguishable statistical divergence from clean samples, which inspires repurposing it as a watermark for dataset IP protection. Direct repurposing is challenging due to low success rates; we overcome this by statistical verification using a unified density metric. We further propose a multi-level watermarking scheme adapting to feature-level, soft-label, or hard-label outputs in CL. Experiments show some backdoor attacks can be repurposed as effective watermarks with trade-offs among fidelity, verifiability, and robustness. This work demonstrates weak backdoor effects become reliable signals for dataset IP protection in challenging CL settings.

54.0CRApr 1
Jailbreaking Generative AI: Multivector Phishing Threats and Transformer based Defenses

Rina Mishra, Gaurav Varshney

The rise of Generative AI (GenAI) has reshaped the cybersecurity landscape by enabling new attack vectors and lowering the barrier for executing advanced social engineering campaigns. This study conducts an empirical analysis of jailbreaking vulnerabilities in ChatGPT-4o-Mini, showing that novices can bypass safeguards to generate complete multivector phishing attacks across email, web, SMS, and voice channels. Controlled experiments reveal that role-based jailbreaks produce fully operational attack paths capable of credential harvesting. User studies further demonstrate the disruptive potential of GenAI: novice participants exhibited a 240\% increase in perceived phishing competence, a 400\% improvement in task completion rates, and a 57\% reduction in implementation time when assisted by GenAI compared to traditional internet resources. To address these risks, a transformer-based detection framework was developed, achieving an F1-score of 0.9864 (XLNET) for identifying malicious prompts. The work underscores the urgency of strengthening LLM guardrails and provides an annotated dataset to support future defenses.

CRJul 18, 2020
A Comprehensive Survey of Aadhar and Security Issues

Isha Pali, Lisa Krishania, Divya Chadha et al.

The concept of Aadhaar came with the need for a unique identity for every individual. To implement this, the Indian government created the authority UIDAI to distribute and generate user identities for every individual based on their demographic and biometric data. After the implementation, came the security issues and challenges of Aadhaar and its authentication. So, our study focuses on the journey of Aadhaar from its history to the current condition. The paper also describes the authentication process, and the updates happened over time. We have also provided an analysis of the security attacks witnessed so far as well as the possible countermeasure and its classification. Our main aim is to cover all the security aspects related to Aadhaar to avoid possible security attacks. Also, we have included the current updates and news related to Aadhaar.

CRApr 12, 2018
A Metapolicy Framework for Enhancing Domain Expressiveness on the Internet

Gaurav Varshney, Pawel Szalachowski

Domain Name System (DNS) domains became Internet-level identifiers for entities (like companies, organizations, or individuals) hosting services and sharing resources over the Internet. Domains can specify a set of security policies (such as, email and trust security policies) that should be followed by clients while accessing the resources or services represented by them. Unfortunately, in the current Internet, the policy specification and enforcement are dispersed, non-comprehensive, insecure, and difficult to manage. In this paper, we present a comprehensive and secure metapolicy framework for enhancing the domain expressiveness on the Internet. The proposed framework allows the domain owners to specify, manage, and publish their domain-level security policies over the existing DNS infrastructure. The framework also utilizes the existing trust infrastructures (i.e., TLS and DNSSEC) for providing security. By reusing the existing infrastructures, our framework requires minimal changes and requirements for adoption. We also discuss the initial results of the measurements performed to evaluate what fraction of the current Internet can get benefits from deploying our framework. Moreover, overheads of deploying the proposed framework have been quantified and discussed.