An optimized system to solve text-based CAPTCHA
This work addresses security vulnerabilities in CAPTCHA systems, but its contribution is incremental because it builds on existing segmentation and classification methods.
The paper tackles the problem of defeating text-based CAPTCHAs by developing a system that segments characters adaptively and uses a hybrid classifier, achieving state-of-the-art performance.
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) can be used to protect data from automated bots. Many kinds of CAPTCHA have been designed, but the text-based scheme is the most frequently used because it is the most convenient and user-friendly \cite{bursztein2011text}. Our goal is to defeat these CAPTCHAs, which first requires splitting each CAPTCHA into individual characters. Different types of CAPTCHA currently call for different segmentation procedures, and no general segmentation algorithm isolates the characters in all kinds of examples, so segmentation has to be handled separately for each type. In this paper, we build a complete system that defeats these CAPTCHAs and achieves state-of-the-art performance. Specifically, we present a self-adaptive algorithm that segments different kinds of characters optimally, and then use both existing methods and a convolutional neural network of our own construction as an additional classifier. Results are provided showing that our system works well in defeating these CAPTCHAs.
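To illustrate the segment-then-classify structure described above, the following is a minimal sketch of the segmentation stage only. It is not the self-adaptive algorithm of this paper: it assumes a simple vertical-projection split, which works only when characters are separated by at least one blank column, and the names `segment_columns`, `crop`, and the toy image are hypothetical. In a full pipeline, each cropped glyph would then be passed to a classifier such as a convolutional neural network.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def segment_columns(image: Image, min_width: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column spans that contain ink.

    Assumption: characters are separated by fully blank columns.
    Touching or overlapping characters are NOT handled here.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    projection = [sum(row[x] for row in image) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(projection):
        if count > 0 and start is None:
            start = x  # a new character begins
        elif count == 0 and start is not None:
            if x - start >= min_width:
                spans.append((start, x))  # character ended at a blank column
            start = None
    # Close a character that runs to the right edge of the image.
    if start is not None and width - start >= min_width:
        spans.append((start, width))
    return spans


def crop(image: Image, span: Tuple[int, int]) -> Image:
    """Cut the columns of one span out of the image."""
    start, end = span
    return [row[start:end] for row in image]


# Toy 3x9 image holding three glyphs separated by blank columns.
toy = [
    [1, 1, 0, 0, 1, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1, 1],
]
spans = segment_columns(toy)
glyphs = [crop(toy, s) for s in spans]
print(spans)
```

The fixed blank-column rule in this sketch is exactly the kind of scheme-specific assumption that fails on CAPTCHAs with connected or overlapping characters, which is the motivation for an adaptive segmentation step.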