CVJul 27, 2023
EnSolver: Uncertainty-Aware Ensemble CAPTCHA Solvers with Theoretical GuaranteesDuc C. Hoang, Behzad Ousat, Amin Kharraz et al.
The popularity of text-based CAPTCHA as a security mechanism to protect websites from automated bots has prompted researches in CAPTCHA solvers, with the aim of understanding its failure cases and subsequently making CAPTCHAs more secure. Recently proposed solvers, built on advances in deep learning, are able to crack even the very challenging CAPTCHAs with high accuracy. However, these solvers often perform poorly on out-of-distribution samples that contain visual features different from those in the training set. Furthermore, they lack the ability to detect and avoid such samples, making them susceptible to being locked out by defense systems after a certain number of failed attempts. In this paper, we propose EnSolver, a family of CAPTCHA solvers that use deep ensemble uncertainty to detect and skip out-of-distribution CAPTCHAs, making it harder to be detected. We prove novel theoretical bounds on the effectiveness of our solvers and demonstrate their use with state-of-the-art CAPTCHA solvers. Our experiments show that the proposed approaches perform well when cracking CAPTCHA datasets that contain both in-distribution and out-of-distribution samples.
LGDec 9, 2024
In-Application Defense Against Evasive Web Scans through Behavioral AnalysisBehzad Ousat, Mahshad Shariatnasab, Esteban Schafir et al.
Web traffic has evolved to include both human users and automated agents, ranging from benign web crawlers to adversarial scanners such as those capable of credential stuffing, command injection, and account hijacking at the web scale. The estimated financial costs of these adversarial activities are estimated to exceed tens of billions of dollars in 2023. In this work, we introduce WebGuard, a low-overhead in-application forensics engine, to enable robust identification and monitoring of automated web scanners, and help mitigate the associated security risks. WebGuard focuses on the following design criteria: (i) integration into web applications without any changes to the underlying software components or infrastructure, (ii) minimal communication overhead, (iii) capability for real-time detection, e.g., within hundreds of milliseconds, and (iv) attribution capability to identify new behavioral patterns and detect emerging agent categories. To this end, we have equipped WebGuard with multi-modal behavioral monitoring mechanisms, such as monitoring spatio-temporal data and browser events. We also design supervised and unsupervised learning architectures for real-time detection and offline attribution of human and automated agents, respectively. Information theoretic analysis and empirical evaluations are provided to show that multi-modal data analysis, as opposed to uni-modal analysis which relies solely on mouse movement dynamics, significantly improves time-to-detection and attribution accuracy. Various numerical evaluations using real-world data collected via WebGuard are provided achieving high accuracy in hundreds of milliseconds, with a communication overhead below 10 KB per second.
CRAug 20, 2021
An Adaptable Deep Learning-Based Intrusion Detection System to Zero-Day AttacksMahdi Soltani, Behzad Ousat, Mahdi Jafari Siavoshani et al.
The intrusion detection system (IDS) is an essential element of security monitoring in computer networks. An IDS distinguishes the malicious traffic from the benign one and determines the attack types targeting the assets of the organization. The main challenge of an IDS is facing new (i.e., zero-day) attacks and separating them from benign traffic and existing types of attacks. Along with the power of the deep learning-based IDSes in auto-extracting high-level features and its independence from the time-consuming and costly signature extraction process, the mentioned challenge still exists in this new generation of IDSes. In this paper, we propose a framework for deep learning-based IDSes addressing new attacks. This framework is the first approach using both deep novelty-based classifiers besides the traditional clustering based on the specialized layer of deep structures, in the security scope. Additionally, we introduce DOC++ as a newer version of DOC as a deep novelty-based classifier. We also employ the Deep Intrusion Detection (DID) framework for the preprocessing phase, which improves the ability of deep learning algorithms to detect content-based attacks. We compare four different algorithms (including DOC, DOC++, OpenMax, and AutoSVM) as the novelty classifier of the framework and use both the CIC-IDS2017 and CSE-CIC-IDS2018 datasets for the evaluation. Our results show that DOC++ is the best implementation of the open set recognition module. Besides, the completeness and homogeneity of the clustering and post-training phase prove that this model is good enough for the supervised labeling and updating phase.